Interactive presentation: "Data Exploration and Modeling"
- Selected dataset: Movie metadata
- URL: https://www.kaggle.com/datasets/bobirino/movie-metadata
Quantitative and Qualitative Variables
Quantitative Variables
| Name | Variable type | Description |
|---|---|---|
| num_critic_for_reviews | Discrete numeric | Number of professional critic reviews recorded |
| duration | Continuous numeric | Running time of the movie in minutes |
| director_facebook_likes | Discrete numeric | Number of Facebook likes for the director |
| actor_3_facebook_likes | Discrete numeric | Number of Facebook likes for the third lead actor |
| actor_1_facebook_likes | Discrete numeric | Number of Facebook likes for the lead actor |
| gross | Continuous numeric | Gross revenue generated by the movie (in dollars) |
| num_voted_users | Discrete numeric | Number of users who rated the movie on IMDb |
| cast_total_facebook_likes | Discrete numeric | Total Facebook likes of the main cast |
| facenumber_in_poster | Discrete numeric | Number of faces visible on the movie poster |
| num_user_for_reviews | Discrete numeric | Number of reviews written by users |
| budget | Continuous numeric | Estimated production budget (in dollars) |
| title_year | Discrete numeric | Release year of the movie |
| actor_2_facebook_likes | Discrete numeric | Number of Facebook likes for the second lead actor |
| imdb_score | Continuous numeric | Average user rating on IMDb (scale 1–10) |
| aspect_ratio | Continuous numeric | Image aspect ratio (e.g., 1.85, 2.35) |
| movie_facebook_likes | Discrete numeric | Number of Facebook likes for the movie |
Qualitative Variables
| Name | Variable type | Description |
|---|---|---|
| color | Nominal categorical | Whether the movie is in color or black and white |
| director_name | Nominal categorical | Name of the movie's director |
| actor_2_name | Nominal categorical | Name of the second lead actor |
| genres | Nominal categorical | Genres associated with the movie (may include several) |
| actor_1_name | Nominal categorical | Name of the lead actor |
| movie_title | Nominal categorical | Title of the movie |
| actor_3_name | Nominal categorical | Name of the third lead actor |
| plot_keywords | Nominal categorical | Keywords describing the plot |
| movie_imdb_link | Nominal categorical | URL of the movie's IMDb page |
| language | Nominal categorical | Main language of the movie |
| country | Nominal categorical | Country of origin of the movie |
| content_rating | Ordinal categorical | Content rating (e.g., PG, R) by recommended age |
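The split between the two tables can be cross-checked against the dtypes that pandas infers. The following is a minimal sketch, assuming a DataFrame with the same kind of columns; the three-row `sample` frame is made up for illustration and is not taken from the dataset.

```python
import pandas as pd

# Hypothetical sample that mimics a few columns of the dataset (values are illustrative).
sample = pd.DataFrame({
    "color": ["Color", "Color", " Black and White"],
    "duration": [178.0, 169.0, 148.0],
    "imdb_score": [7.9, 7.1, 6.8],
    "content_rating": ["PG-13", "PG-13", "R"],
})

# Numeric dtypes correspond to the quantitative variables, everything else to the qualitative ones.
quantitative_cols = sample.select_dtypes(include="number").columns.tolist()
qualitative_cols = sample.select_dtypes(exclude="number").columns.tolist()

print(quantitative_cols)
print(qualitative_cols)
```

Dtype inference only separates numeric from non-numeric columns; the discrete/continuous and nominal/ordinal labels in the tables remain a modeling judgment.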
Libraries used in the project
# Standard library
import math
import os
import random
import re
from typing import Optional, Sequence, Tuple

# Third-party
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import seaborn as sns
from IPython.display import display  # implicit in notebooks; imported so `summary()` also works in scripts
from scipy.stats import zscore
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import MultiLabelBinarizer, StandardScaler
from sklearn.svm import SVR
Helper functions and classes
Decimal formatting function
def format_decimals(number):
    # Format with thousands separators and two decimals, then drop trailing zeros and a dangling dot.
    return f"{number:,.2f}".rstrip('0').rstrip('.')
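A quick check of how the helper behaves. The function is repeated here only so the snippet runs on its own, and the sample values are arbitrary.

```python
def format_decimals(number):
    # Two decimals with thousands separators, then drop trailing zeros and a dangling dot.
    return f"{number:,.2f}".rstrip('0').rstrip('.')

# Arbitrary sample values: a float, an integer, a value that rounds, and a large amount.
examples = [1234.5, 100, 0.126, 2500000.0]
formatted = [format_decimals(value) for value in examples]
print(formatted)
```

Because the trailing-zero stripping happens after formatting to two decimals, whole numbers lose their decimal part entirely while values such as `1234.5` keep only the digits they need.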
Custom linear regression class
class CustomLinearRegression():
    """Wraps a scikit-learn LinearRegression with split, fit, evaluation, and plotting steps."""

    def __init__(self, X, y, title, is_interactive = True):
        self.X = X
        self.y = y
        self.title = title  # used as a prefix for the saved HTML file names
        self.model = LinearRegression()
        self.coef_df: pd.DataFrame = pd.DataFrame()
        self.performance_metrics = None
        self.is_interactive = is_interactive
    def run(self):
        # Full pipeline: split, fit, predict, evaluate, collect coefficients, then plot.
        print("--> Iniciando la división del dataset")
        X_train, X_test, y_train, y_test = self.split_data()
        print("-" * 40)
        print("--> Iniciando el entrenamiento del modelo")
        self.fit(X_train, y_train)
        print("-" * 40)
        print("--> Iniciando la predicción del modelo")
        y_pred_test = self.predict(X_test)
        print("-" * 40)
        print("--> Iniciando la evaluación del modelo")
        self.evaluate_model(y_test, y_pred_test)
        print("-" * 40)
        print("--> Iniciando la creación del dataframe de coeficientes")
        self.coefficients_per_variable(X_train)
        print("-" * 40)
        print("--> Prediciendo sobre entrenamiento y prueba")
        y_train_pred, y_test_pred = self.test_prediction(X_train, X_test)
        print("-" * 40)
        # Same three plots in both modes: comparison, residuals, feature importance.
        if self.is_interactive:
            print("--> Graficando comparación interactiva del modelo")
            self.plot_comparison_interactive(y_train, y_test, y_train_pred, y_test_pred)
            print("-" * 40)
            print("--> Graficando residuos interactivos")
            self.plot_residuals_interactive(X_test, y_test)
            print("-" * 40)
            print("--> Graficando importancia de variables interactiva")
            self.plot_feature_importance_interactive()
            print("-" * 40)
        else:
            print("--> Graficando comparación del modelo")
            self.plot_comparison(y_train, y_test, y_train_pred, y_test_pred)
            print("-" * 40)
            print("--> Graficando residuos")
            self.plot_residuals(X_test, y_test)
            print("-" * 40)
            print("--> Graficando importancia de variables")
            self.plot_feature_importance()
            print("-" * 40)
    def split_data(self):
        X_train, X_test, y_train, y_test = train_test_split(self.X, self.y, test_size=0.2, random_state=42)
        print(f"\tTamaño del dataset: {len(self.X)}")
        print(f"\tTamaño del dataset de entrenamiento: {len(X_train)}")
        print(f"\tTamaño del dataset de prueba: {len(X_test)}")
        return X_train, X_test, y_train, y_test

    def fit(self, X_train, y_train):
        self.model.fit(X_train, y_train)

    def predict(self, X):
        return self.model.predict(X)

    def evaluate_model(self, y_true, y_pred):
        mae = mean_absolute_error(y_true, y_pred)
        mse = mean_squared_error(y_true, y_pred)
        r2 = r2_score(y_true, y_pred)
        print(f"\tError absoluto medio (MAE): {mae:.2f}")
        print(f"\tError cuadrático medio (MSE): {mse:.2f}")
        print(f"\tCoeficiente de determinación (R²): {r2:.2f}")
        self.performance_metrics = pd.DataFrame({
            'Total Features': [self.X.shape[1]],
            'MAE': [mae],
            'MSE': [mse],
            'R2': [r2]
        })
    def coefficients_per_variable(self, X_train):
        if hasattr(X_train, 'columns'):
            variable_names = X_train.columns
        else:
            # NumPy input (e.g., PCA output) has no column names, so label the components PC1..PCn.
            n_components = X_train.shape[1]
            variable_names = [f"PC{i+1}" for i in range(n_components)]
        # Keeps the 10 largest signed coefficients; strongly negative coefficients are not included.
        self.coef_df = pd.DataFrame({
            'Variable': variable_names,
            'Coeficiente': self.model.coef_
        }).sort_values(by='Coeficiente', ascending=False).head(10)

    def test_prediction(self, X_train, X_test):
        return self.predict(X_train), self.predict(X_test)
    def plot_comparison(self, y_train, y_test, y_train_pred, y_test_pred):
        plt.figure(figsize=(10, 6))
        plt.scatter(y_train, y_train_pred, color='blue', alpha=0.5, label='Entrenamiento')
        plt.scatter(y_test, y_test_pred, color='green', alpha=0.5, label='Prueba')
        # Diagonal where prediction equals the real value.
        plt.plot([self.y.min(), self.y.max()], [self.y.min(), self.y.max()], color='red', linestyle='--', label='Ideal')
        plt.xlabel('IMDb Score Real')
        plt.ylabel('IMDb Score Predicho')
        plt.title('Comparación de Predicciones: Entrenamiento vs Prueba')
        plt.legend()
        plt.grid(True)
        plt.tight_layout()
        plt.show()

    def plot_comparison_interactive(self, y_train, y_test, y_train_pred, y_test_pred):
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=y_train, y=y_train_pred,
            mode='markers',
            name='Entrenamiento',
            marker=dict(color='blue', opacity=0.5),
            hovertemplate='Real: %{x}<br>Predicho: %{y}'
        ))
        fig.add_trace(go.Scatter(
            x=y_test, y=y_test_pred,
            mode='markers',
            name='Prueba',
            marker=dict(color='green', opacity=0.5),
            hovertemplate='Real: %{x}<br>Predicho: %{y}'
        ))
        # Diagonal where prediction equals the real value.
        fig.add_trace(go.Scatter(
            x=[min(y_train.min(), y_test.min()), max(y_train.max(), y_test.max())],
            y=[min(y_train.min(), y_test.min()), max(y_train.max(), y_test.max())],
            mode='lines',
            name='Ideal',
            line=dict(color='red', dash='dash')
        ))
        fig.update_layout(
            title='Comparación de Predicciones: Entrenamiento vs Prueba',
            xaxis_title='IMDb Score Real',
            yaxis_title='IMDb Score Predicho',
            template='plotly_white',
        )
        fig.show()
        fig_title = f"{self.title}_plot_comparison_interactive.html"
        os.makedirs("assets", exist_ok=True)  # write_html fails if the folder does not exist
        fig.write_html(f"assets/{fig_title}")
        print(f"\tGráfico guardado en assets/{fig_title}")
    def plot_residuals(self, X_test, y_test):
        y_pred = self.predict(X_test)
        residuals = y_test - y_pred
        plt.figure(figsize=(10, 5))
        plt.scatter(y_pred, residuals, alpha=0.6, color='purple')
        plt.axhline(y=0, color='red', linestyle='--')
        plt.xlabel('Predicción')
        plt.ylabel('Residuo (Real - Predicción)')
        plt.title('Gráfico de Residuos')
        plt.grid(True)
        plt.tight_layout()
        plt.show()

    def plot_residuals_interactive(self, X_test, y_test):
        y_pred = self.predict(X_test)
        residuals = y_test - y_pred
        fig = go.Figure()
        fig.add_trace(go.Scatter(
            x=y_pred, y=residuals,
            mode='markers',
            marker=dict(color='purple', opacity=0.6),
            hovertemplate='Predicción: %{x}<br>Residuo: %{y}'
        ))
        # Horizontal reference line at residual = 0.
        fig.add_trace(go.Scatter(
            x=[min(y_pred), max(y_pred)],
            y=[0, 0],
            mode='lines',
            line=dict(color='red', dash='dash'),
            name='Residuo = 0'
        ))
        fig.update_layout(
            title='Gráfico de Residuos',
            xaxis_title='Predicción',
            yaxis_title='Residuo (Real - Predicción)',
            template='plotly_white'
        )
        fig.show()
        fig_title = f"{self.title}_plot_residuals_interactive.html"
        os.makedirs("assets", exist_ok=True)  # write_html fails if the folder does not exist
        fig.write_html(f"assets/{fig_title}")
        print(f"\tGráfico guardado en assets/{fig_title}")
    def plot_feature_importance(self, top_n=10):
        # coef_df starts as an empty DataFrame, so check for emptiness as well as None.
        if self.coef_df is None or self.coef_df.empty:
            print("ERROR: Primero ejecuta `.run()` para calcular los coeficientes.")
            return
        df = self.coef_df.head(top_n).sort_values(by='Coeficiente')
        plt.figure(figsize=(10, 6))
        plt.barh(df['Variable'], df['Coeficiente'], color='teal')
        plt.xlabel('Coeficiente')
        plt.title(f'Top {top_n} Variables más Influyentes')
        plt.grid(True, axis='x')
        plt.tight_layout()
        plt.show()

    def plot_feature_importance_interactive(self, top_n=10):
        if self.coef_df is None or self.coef_df.empty:
            print("ERROR: Primero ejecuta `.run()` para calcular los coeficientes.")
            return
        df = self.coef_df.head(top_n).sort_values(by='Coeficiente')
        fig = go.Figure(go.Bar(
            x=df['Coeficiente'],
            y=df['Variable'],
            orientation='h',
            marker_color='teal',
            hovertemplate='Variable: %{y}<br>Coeficiente: %{x}'
        ))
        fig.update_layout(
            title=f'Top {top_n} Variables más Influyentes',
            xaxis_title='Coeficiente',
            template='plotly_white'
        )
        fig.show()
        fig_title = f"{self.title}_plot_feature_importance_interactive.html"
        os.makedirs("assets", exist_ok=True)  # write_html fails if the folder does not exist
        fig.write_html(f"assets/{fig_title}")
        print(f"\tGráfico guardado en assets/{fig_title}")
    def summary(self):
        print("\nRESUMEN DEL MODELO")
        print("-" * 40)
        if self.performance_metrics is not None:
            print("--> Métricas de desempeño:")
            display(self.performance_metrics)
        else:
            print("ERROR: No se han calculado métricas. Ejecuta `.run()` primero.")
        # coef_df starts as an empty DataFrame, so check for emptiness as well as None.
        if self.coef_df is not None and not self.coef_df.empty:
            print("--> Principales coeficientes:")
            display(self.coef_df)
        else:
            print("ERROR: No se han generado coeficientes aún.")
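The three metrics reported by `evaluate_model` can be reproduced by hand, which helps when reading its output. The sketch below uses made-up true and predicted scores (not results from this dataset) and follows the standard definitions of MAE, MSE, and R².

```python
import numpy as np

# Made-up true and predicted scores, only for illustration.
y_true = np.array([7.0, 6.0, 8.0, 5.0])
y_pred = np.array([6.5, 6.0, 7.0, 5.5])

errors = y_true - y_pred
mae = np.mean(np.abs(errors))                    # mean absolute error
mse = np.mean(errors ** 2)                       # mean squared error
ss_res = np.sum(errors ** 2)                     # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1 - ss_res / ss_tot                         # coefficient of determination

print(f"MAE={mae:.3f} MSE={mse:.3f} R2={r2:.3f}")
```

MAE is in the units of the target, MSE is in squared units and penalizes large errors more heavily, and R² compares the model against always predicting the mean.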
EDA Visualizer helpers class
class EDAVisualizerHelpers:
    @staticmethod
    def _assert_cols(dataframe: pd.DataFrame, columns: Sequence[str]) -> None:
        # Fail early with the list of missing columns instead of a KeyError inside a plot call.
        missing = [column for column in columns if column not in dataframe.columns]
        if missing:
            raise ValueError(f"Columnas no encontradas en el DataFrame: {missing}")

    @staticmethod
    def _slugify(text: str) -> str:
        # Drop punctuation, then collapse whitespace runs into underscores.
        text = re.sub(r"[^\w\s-]", "", text, flags=re.UNICODE)
        text = re.sub(r"\s+", "_", text.strip())
        return text

    @staticmethod
    def _ensure_dir(path: str) -> None:
        os.makedirs(path, exist_ok=True)

    @staticmethod
    def _save_plotly_html(fig: go.Figure, title: str, folder: str = "assets") -> str:
        # The file name is derived from the plot title.
        EDAVisualizerHelpers._ensure_dir(folder)
        fname = f"{EDAVisualizerHelpers._slugify(title)}.html"
        out = os.path.join(folder, fname)
        fig.write_html(out)
        return out
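To see what file name a given plot title produces, the slug logic can be run on its own. This sketch repeats the two regular expressions from `_slugify` in a standalone function; the title is an arbitrary example.

```python
import os
import re

def slugify(text: str) -> str:
    # Same two steps as `_slugify`: drop punctuation, then turn whitespace runs into "_".
    text = re.sub(r"[^\w\s-]", "", text, flags=re.UNICODE)
    return re.sub(r"\s+", "_", text.strip())

# Arbitrary example title with accents and parentheses.
title = "Distribución de IMDb score (por género)"
file_name = f"{slugify(title)}.html"
out_path = os.path.join("assets", file_name)
print(out_path)
```

Accented letters survive because `\w` matches Unicode word characters, while parentheses and other punctuation are removed. Two titles that differ only in punctuation therefore map to the same file and the later plot overwrites the earlier one.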
Static EDA Visualizer class
class EDAVisualizerStatic:
    @staticmethod
    def plot_boxplot(
        title: str,
        data: pd.DataFrame,
        x: str, y: str,
        x_label: Optional[str] = None, y_label: Optional[str] = None,
        figsize : Tuple[int, int] = (20, 10),
        rotation: int = 90,
        y_range: Optional[Tuple[float, float]] = None,
        grid: bool = True, show: bool = True
    ):
        if x_label is None: x_label = x
        if y_label is None: y_label = y
        plt.figure(figsize=figsize)
        sns.boxplot(data=data, x=x, y=y)
        plt.xticks(rotation=rotation)
        plt.title(title)
        plt.xlabel(x_label)
        plt.ylabel(y_label)
        if y_range is not None: plt.ylim(*y_range)
        if grid: plt.grid(axis='y', linestyle='--', alpha=0.9)
        if show: plt.show()
    @staticmethod
    def plot_histogram(
        title: str,
        data: pd.DataFrame,
        column: str,
        x_label: Optional[str] = None, y_label: Optional[str] = "Frecuencia",
        bins: int = 30, kde: bool = True,
        figsize: Tuple[int, int] = (8, 4),
        x_range: Optional[Tuple[float, float]] = None,
        grid: bool = True, show: bool = True,
    ):
        EDAVisualizerHelpers._assert_cols(data, [column])
        if not np.issubdtype(data[column].dropna().dtype, np.number): # type: ignore
            raise TypeError(f"La columna '{column}' debe ser numérica para histograma.")
        x_label = column if x_label is None else x_label
        plt.figure(figsize=figsize)
        sns.histplot(data=data, x=column, bins=bins, kde=kde)
        plt.title(title)
        plt.xlabel(x_label)
        plt.ylabel(y_label if y_label else "Frecuencia")
        if x_range is not None: plt.xlim(*x_range)
        if grid: plt.grid(linestyle="--", alpha=0.9)
        if show: plt.show()
    @staticmethod
    def plot_categorical_counts_grid(
        data: pd.DataFrame,
        categorical_cols: Sequence[str],
        n_cols: int = 2,
        top_n: int = 8,
        figsize: Optional[Tuple[int, int]] = None,
        rotate_xticks: int = 45,
        show: bool = True,
    ):
        """
        Grid of countplots, limited to the top-N categories when a column has too many.
        """
        EDAVisualizerHelpers._assert_cols(data, list(categorical_cols))
        n = n_cols
        n_rows = math.ceil(len(categorical_cols) / n)
        if figsize is None:
            figsize = (15, 7 * n_rows)
        plt.figure(figsize=figsize)
        for i, col in enumerate(categorical_cols):
            plt.subplot(n_rows, n, i + 1)
            series = data[col].astype("string")
            n_unique = series.nunique(dropna=True)
            if n_unique <= top_n:
                sns.countplot(data=data, x=col)
                plt.title(col)
            else:
                # Too many categories: keep only the top_n most frequent ones.
                top_categories = series.value_counts().nlargest(top_n).index
                sns.countplot(data=data[data[col].isin(top_categories)], x=col)
                plt.title(f"{col} (Top {top_n} categorías)")
            plt.xticks(rotation=rotate_xticks)
            plt.xlabel(col)
            plt.ylabel("Frecuencia")
        plt.tight_layout()
        plt.subplots_adjust(hspace=0.5, wspace=0.3)
        if show: plt.show()
    @staticmethod
    def plot_pairplot(dataset: pd.DataFrame, title: str):
        g = sns.pairplot(dataset)
        g.map_upper(sns.kdeplot, levels=4, color=".2")
        # A pair plot has many axes, so the title is set on the figure rather than on the last subplot.
        g.figure.suptitle(title, y=1.02)
        plt.show()

    @staticmethod
    def plot_heatmap(correlation_matrix: pd.DataFrame, figsize: Optional[Tuple[int, int]] = None,):
        fig, ax = plt.subplots(figsize=figsize)
        # Draw on the axes created above so that figsize is respected.
        sns.heatmap(correlation_matrix, annot=True, fmt=".2f", annot_kws={'size': 16}, ax=ax)
        plt.show()
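    @staticmethod
    def plot_3d_projection(
        dataset: pd.DataFrame,
        column: str,
        title: str,
        x_label: str,
        y_label: str,
        z_label: str,
        cbar_label: str,
        figsize: Optional[Tuple[int, int]] = None,
        x_pca: Optional[np.ndarray] = None,
    ):
        # x_pca holds the PCA coordinates (n_samples x 3 or more).
        # When omitted, a notebook-level variable named `x_pca` is used instead.
        if x_pca is None:
            x_pca = globals().get("x_pca")
        if x_pca is None:
            raise ValueError("Se requiere `x_pca` con al menos 3 componentes principales.")
        fig = plt.figure(figsize=figsize)
        ax = fig.add_subplot(111, projection='3d')
        scatter = ax.scatter(x_pca[:, 0], x_pca[:, 1], x_pca[:, 2], c=dataset[column], cmap='viridis', s=40) # type: ignore
        ax.set_xlabel(x_label)
        ax.set_ylabel(y_label)
        ax.set_zlabel(z_label)
        ax.set_title(title)
        cbar = plt.colorbar(scatter, ax=ax, shrink=0.5, aspect=10)
        cbar.set_label(cbar_label)
        plt.tight_layout()
        plt.show()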
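The top-N rule used by `plot_categorical_counts_grid` can be tried without plotting. This is a minimal sketch on a made-up `country` column; the values are illustrative, not taken from the dataset.

```python
import pandas as pd

# Made-up country column, only for illustration.
countries = pd.Series(["USA", "USA", "UK", "USA", "France", "UK", "Spain"], name="country")

top_n = 2
# Most frequent categories first; nlargest keeps the top_n of them.
top_categories = countries.value_counts().nlargest(top_n).index
# Rows whose category is not in the top_n are dropped before counting.
filtered = countries[countries.isin(top_categories)]
counts = filtered.value_counts()

print(counts)
```

Rows outside the top categories are dropped rather than grouped into an "other" bucket, so the bars in a top-N plot do not add up to the full number of records.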
Interactive EDA Visualizer class
class EDAVisualizerInteractive:
    @staticmethod
    def plot_boxplot(
        title: str,
        data: pd.DataFrame,
        x: str, y: str,
        x_label: Optional[str] = None,
        y_label: Optional[str] = None,
        y_range: Optional[Tuple[float, float]] = None,
        template: str = "plotly_white",
        height: int = 750, width: int = 1350,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
        points: str = "all",  # 'all' | 'outliers' | False
        hover_extra_cols: Optional[Sequence[str]] = None,
    ):
        EDAVisualizerHelpers._assert_cols(data, [x, y])
        if hover_extra_cols:
            EDAVisualizerHelpers._assert_cols(data, list(hover_extra_cols))
        x_label = x if x_label is None else x_label
        y_label = y if y_label is None else y_label
        fig = px.box(
            data_frame=data,
            x=x, y=y,
            points=points,
            hover_data=list(hover_extra_cols) if hover_extra_cols else None,
            title=title,
        )
        fig.update_layout(
            xaxis_title=x_label, yaxis_title=y_label,
            xaxis_tickangle=-90,
            template=template,
            height=height, width=width,
        )
        if y_range is not None: fig.update_yaxes(range=list(y_range))
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Boxplot guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_histogram_by_category(
        title: str,
        subtitle: str,
        data: pd.DataFrame,
        category_col: str,
        value_col: str,
        categories: Optional[Sequence[str]] = None,
        nbins: int = 30,
        template: str = "plotly_white",
        x_label: Optional[str] = None,
        y_label: Optional[str] = "Frecuencia",
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
        height: int = 750,
        width: int = 1350,
    ):
        EDAVisualizerHelpers._assert_cols(data, [category_col, value_col])
        # Validate that the value column is numeric
        if not np.issubdtype(data[value_col].dropna().dtype, np.number): # type: ignore
            raise TypeError(f"'{value_col}' debe ser numérica para histograma.")
        cat_values = categories or list(map(str, sorted(data[category_col].dropna().unique())))
        cat_values = list(cat_values)  # make sure it can be indexed
        fig = go.Figure()
        # One histogram trace per category; only the first is visible initially.
        for i, cat in enumerate(cat_values):
            # Compare as strings, since cat_values may have been converted with str() above.
            subset = data[data[category_col].astype(str) == str(cat)]
            fig.add_trace(
                go.Histogram(
                    x=subset[value_col],
                    name=str(cat),
                    visible=(i == 0),
                    nbinsx=nbins,
                    opacity=0.75,
                )
            )
        # One dropdown button per category; each shows only its own trace.
        buttons = []
        for i, cat in enumerate(cat_values):
            visible_mask = [j == i for j in range(len(cat_values))]
            buttons.append(
                dict(
                    label=str(cat),
                    method="update",
                    args=[
                        {"visible": visible_mask},
                        {
                            "title": f"{subtitle} {cat}",
                            "xaxis": {"title": x_label or value_col},
                            "yaxis": {"title": y_label or "Frecuencia"},
                        },
                    ],
                )
            )
        fig.update_layout(
            updatemenus=[dict(active=0, buttons=buttons, x=1.15, y=1.15)],
            title=title if not cat_values else f"{subtitle} {cat_values[0]}",
            xaxis_title=x_label or value_col,
            yaxis_title=y_label or "Frecuencia",
            template=template,
            height=height,
            width=width,
            bargap=0.1,
        )
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Histograma guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_categorical_counts_dropdown(
        title: str,
        data: pd.DataFrame,
        categorical_cols: Sequence[str],
        top_n: int = 8,
        template: str = "plotly_white",
        height: int = 750,
        width: int = 1350,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
    ) -> Optional[str]:
        """
        A single Plotly bar chart with a dropdown to switch between categorical columns.
        Limits each column to its top-N categories when needed.
        """
        EDAVisualizerHelpers._assert_cols(data, list(categorical_cols))
        fig = go.Figure()
        # Build the traces (one per categorical column)
        for i, col in enumerate(categorical_cols):
            series = data[col].astype("string")
            if series.nunique(dropna=True) <= top_n:
                counts = series.value_counts(dropna=False).sort_values(ascending=False)
            else:
                top_categories = series.value_counts().nlargest(top_n).index
                counts = series[series.isin(top_categories)].value_counts().sort_values(ascending=False)
            fig.add_trace(
                go.Bar(
                    x=counts.index.astype(str),
                    y=counts.values,
                    name=str(col),
                    visible=(i == 0),
                )
            )
        # Visibility dropdown
        buttons = []
        for i, col in enumerate(categorical_cols):
            visible_mask = [j == i for j in range(len(categorical_cols))]
            buttons.append(
                dict(
                    label=str(col),
                    method="update",
                    args=[
                        {"visible": visible_mask},
                        {
                            "title": f"Frecuencia de categorías en {col}",
                            "xaxis": {"title": str(col)},
                            "yaxis": {"title": "Frecuencia"},
                        },
                    ],
                )
            )
        first = str(categorical_cols[0])
        fig.update_layout(
            updatemenus=[dict(active=0, buttons=buttons, x=1.15, y=1.15)],
            title=f"Frecuencia de categorías en {first}",
            xaxis_title=first,
            yaxis_title="Frecuencia",
            template=template,
            height=height,
            width=width,
        )
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Gráfico guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_numerical_hists_dropdown(
        title: str,
        data: pd.DataFrame,
        numerical_cols: Sequence[str],
        nbins: int = 30,
        template: str = "plotly_white",
        height: int = 750,
        width: int = 1350,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
    ) -> Optional[str]:
        """
        A single Plotly histogram with a dropdown to switch between numeric columns.
        """
        EDAVisualizerHelpers._assert_cols(data, list(numerical_cols))
        fig = go.Figure()
        for i, col in enumerate(numerical_cols):
            if not np.issubdtype(data[col].dropna().dtype, np.number): # type: ignore
                raise TypeError(f"'{col}' debe ser numérica para histograma.")
            fig.add_trace(
                go.Histogram(
                    x=data[col],
                    name=str(col),
                    nbinsx=nbins,
                    opacity=0.75,
                    visible=(i == 0),
                )
            )
        buttons = []
        for i, col in enumerate(numerical_cols):
            visible_mask = [j == i for j in range(len(numerical_cols))]
            buttons.append(
                dict(
                    label=str(col),
                    method="update",
                    args=[
                        {"visible": visible_mask},
                        {
                            "title": f"Distribución de {col}",
                            "xaxis": {"title": str(col)},
                            "yaxis": {"title": "Frecuencia"},
                        },
                    ],
                )
            )
        first = str(numerical_cols[0])
        fig.update_layout(
            updatemenus=[dict(active=0, buttons=buttons, x=1.15, y=1.15)],
            title=f"Distribución de {first}",
            xaxis_title=first,
            yaxis_title="Frecuencia",
            template=template,
            height=height,
            width=width,
            bargap=0.1,
        )
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Gráfico guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_pairplot(
        dataset: pd.DataFrame,
        title: str,
        height: int = 1000,
        width: int = 1000,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
    ):
        fig = px.scatter_matrix(
            dataset,
            dimensions=dataset.select_dtypes(include='number').columns,
            # Color by the target variable when it is present in the data.
            color='imdb_score' if 'imdb_score' in dataset.columns else None,
            title=title,
            height=height,
            width=width
        )
        fig.update_traces(diagonal_visible=False)  # hide the diagonal panels (each variable against itself)
        fig.update_layout(template='plotly_white')
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Gráfico guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_heatmap(
        correlation_matrix: pd.DataFrame,
        title: str,
        height: int = 1350,
        width: int = 1350,
        font_size: int = 12,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
    ):
        # px.imshow takes the square correlation matrix directly; no long-format reshaping is needed.
        fig = px.imshow(
            correlation_matrix,
            text_auto=True,
            color_continuous_scale='RdBu_r',
            zmin=-1, zmax=1,
            title=title
        )
        fig.update_layout(
            title_font_size=25,
            xaxis_title='Variables',
            yaxis_title='Variables',
            xaxis=dict(tickfont=dict(size=font_size)),
            yaxis=dict(tickfont=dict(size=font_size)),
            template='plotly_white',
            height=height,
            width=width,
        )
        fig.update_traces(textfont_size=font_size)
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Gráfico guardado en {out_path}")
        return out_path
    @staticmethod
    def plot_3d_projection(
        dataset: pd.DataFrame,
        title: str,
        x_label: str,
        y_label: str,
        z_label: str,
        label: str,
        show: bool = True,
        save_html: bool = True,
        save_folder: str = "assets",
    ):
        fig = px.scatter_3d(
            dataset,
            x=x_label, y=y_label, z=z_label,
            color=label,
            color_continuous_scale='Viridis',
            title=title,
            labels={label: label}
        )
        fig.update_traces(marker=dict(size=3))
        if show: fig.show()
        out_path = None
        if save_html:
            out_path = EDAVisualizerHelpers._save_plotly_html(fig, title, folder=save_folder)
            print(f"Gráfico guardado en {out_path}")
        return out_path
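The dropdown methods above all rely on the same idea: every column gets one trace, and each button carries a boolean mask that shows exactly one of them. The sketch below isolates that mask logic in plain Python, without Plotly, using three column names from the dataset as an example.

```python
# Example column names; any list of columns works the same way.
columns = ["duration", "budget", "gross"]

buttons = []
for i, col in enumerate(columns):
    # True only at position i, so selecting this button shows trace i and hides the rest.
    visible_mask = [j == i for j in range(len(columns))]
    buttons.append({
        "label": col,
        "method": "update",
        "args": [{"visible": visible_mask}, {"title": f"Distribución de {col}"}],
    })

for button in buttons:
    print(button["label"], button["args"][0]["visible"])
```

With Plotly's `update` method, the first element of `args` is applied to the traces and the second to the layout, which is why each button can change both the visible trace and the title in one step.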
Loading the dataset
file_name = "movie_metadata.csv"
dataset = pd.read_csv(file_name)
Dataset description
Number of records and number of columns
dataset.shape
(5043, 28)
Dataset inference
print("\033[1mInference:\033[0m El dataset consiste de {} features y {} ejemplos".format(dataset.shape[1], dataset.shape[0]))
Inference: El dataset consiste de 28 features y 5043 ejemplos
Dataset columns
dataset.keys()
Index(['color', 'director_name', 'num_critic_for_reviews', 'duration',
'director_facebook_likes', 'actor_3_facebook_likes', 'actor_2_name',
'actor_1_facebook_likes', 'gross', 'genres', 'actor_1_name',
'movie_title', 'num_voted_users', 'cast_total_facebook_likes',
'actor_3_name', 'facenumber_in_poster', 'plot_keywords',
'movie_imdb_link', 'num_user_for_reviews', 'language', 'country',
'content_rating', 'budget', 'title_year', 'actor_2_facebook_likes',
'imdb_score', 'aspect_ratio', 'movie_facebook_likes'],
dtype='object')
The first 5 and last 5 records of the dataset
dataset.head()
|  | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | ... | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Color | James Cameron | 723.0 | 178.0 | 0.0 | 855.0 | Joel David Moore | 1000.0 | 760505847.0 | Action\|Adventure\|Fantasy\|Sci-Fi | ... | 3054.0 | English | USA | PG-13 | 237000000.0 | 2009.0 | 936.0 | 7.9 | 1.78 | 33000 |
| 1 | Color | Gore Verbinski | 302.0 | 169.0 | 563.0 | 1000.0 | Orlando Bloom | 40000.0 | 309404152.0 | Action\|Adventure\|Fantasy | ... | 1238.0 | English | USA | PG-13 | 300000000.0 | 2007.0 | 5000.0 | 7.1 | 2.35 | 0 |
| 2 | Color | Sam Mendes | 602.0 | 148.0 | 0.0 | 161.0 | Rory Kinnear | 11000.0 | 200074175.0 | Action\|Adventure\|Thriller | ... | 994.0 | English | UK | PG-13 | 245000000.0 | 2015.0 | 393.0 | 6.8 | 2.35 | 85000 |
| 3 | Color | Christopher Nolan | 813.0 | 164.0 | 22000.0 | 23000.0 | Christian Bale | 27000.0 | 448130642.0 | Action\|Thriller | ... | 2701.0 | English | USA | PG-13 | 250000000.0 | 2012.0 | 23000.0 | 8.5 | 2.35 | 164000 |
| 4 | NaN | Doug Walker | NaN | NaN | 131.0 | NaN | Rob Walker | 131.0 | NaN | Documentary | ... | NaN | NaN | NaN | NaN | NaN | NaN | 12.0 | 7.1 | NaN | 0 |
5 rows × 28 columns
dataset.tail()
|  | color | director_name | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_2_name | actor_1_facebook_likes | gross | genres | ... | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5038 | Color | Scott Smith | 1.0 | 87.0 | 2.0 | 318.0 | Daphne Zuniga | 637.0 | NaN | Comedy\|Drama | ... | 6.0 | English | Canada | NaN | NaN | 2013.0 | 470.0 | 7.7 | NaN | 84 |
| 5039 | Color | NaN | 43.0 | 43.0 | NaN | 319.0 | Valorie Curry | 841.0 | NaN | Crime\|Drama\|Mystery\|Thriller | ... | 359.0 | English | USA | TV-14 | NaN | NaN | 593.0 | 7.5 | 16.00 | 32000 |
| 5040 | Color | Benjamin Roberds | 13.0 | 76.0 | 0.0 | 0.0 | Maxwell Moody | 0.0 | NaN | Drama\|Horror\|Thriller | ... | 3.0 | English | USA | NaN | 1400.0 | 2013.0 | 0.0 | 6.3 | NaN | 16 |
| 5041 | Color | Daniel Hsia | 14.0 | 100.0 | 0.0 | 489.0 | Daniel Henney | 946.0 | 10443.0 | Comedy\|Drama\|Romance | ... | 9.0 | English | USA | PG-13 | NaN | 2012.0 | 719.0 | 6.3 | 2.35 | 660 |
| 5042 | Color | Jon Gunn | 43.0 | 90.0 | 16.0 | 16.0 | Brian Herzlinger | 86.0 | 85222.0 | Documentary | ... | 84.0 | English | USA | PG | 1100.0 | 2004.0 | 23.0 | 6.6 | 1.85 | 456 |
5 rows × 28 columns
Check the number of unique values in each feature
dataset.nunique().sort_values()
color                           2
content_rating                 18
facenumber_in_poster           19
aspect_ratio                   22
language                       46
country                        65
imdb_score                     78
title_year                     91
duration                      191
director_facebook_likes       435
budget                        439
num_critic_for_reviews        528
movie_facebook_likes          876
actor_1_facebook_likes        878
actor_3_facebook_likes        906
genres                        914
actor_2_facebook_likes        917
num_user_for_reviews          954
actor_1_name                 2097
director_name                2398
actor_2_name                 3032
actor_3_name                 3521
cast_total_facebook_likes    3978
gross                        4035
plot_keywords                4760
num_voted_users              4826
movie_title                  4917
movie_imdb_link              4919
dtype: int64
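Note that `nunique()` skips missing values by default, whereas `unique()` keeps `NaN` as an entry of its own, so the two counts can differ by one for columns that contain missing data. A small sketch with a made-up Series shaped like the `color` column:

```python
import numpy as np
import pandas as pd

# Made-up values shaped like the `color` column, including one missing entry.
color = pd.Series(["Color", np.nan, " Black and White", "Color"])

n_without_nan = color.nunique()            # NaN is skipped by default
n_with_nan = color.nunique(dropna=False)   # NaN counted as its own value
n_from_unique = len(color.unique())        # unique() also keeps NaN

print(n_without_nan, n_with_nan, n_from_unique)
```

Passing `dropna=False` to `nunique()` makes the two approaches agree.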
List of unique values per column
for key in dataset.keys():
    # unique() keeps NaN as a value, so these totals can exceed the nunique() counts by one.
    unique = dataset[key].unique()
    print(f"\033[1m{key}:\033[0m")
    print(f"\t- Total de datos únicos: {len(unique)}")
    print(f"\t- Valores: {unique}\n")
color: - Total de datos únicos: 3 - Valores: ['Color' nan ' Black and White'] director_name: - Total de datos únicos: 2399 - Valores: ['James Cameron' 'Gore Verbinski' 'Sam Mendes' ... 'Scott Smith' 'Benjamin Roberds' 'Daniel Hsia'] num_critic_for_reviews: - Total de datos únicos: 529 - Valores: [723. 302. 602. 813. nan 462. 392. 324. 635. 375. 673. 434. 403. 313. 450. 733. 258. 703. 448. 451. 422. 599. 343. 509. 251. 446. 315. 516. 377. 644. 750. 300. 608. 334. 376. 366. 378. 525. 495. 469. 304. 436. 453. 424. 654. 539. 590. 338. 490. 306. 575. 428. 470. 298. 488. 322. 421. 162. 367. 240. 384. 248. 284. 396. 645. 408. 219. 486. 682. 85. 264. 418. 186. 585. 91. 250. 536. 370. 416. 401. 521. 10. 218. 576. 226. 443. 188. 286. 288. 280. 653. 712. 642. 1. 187. 362. 500. 389. 235. 231. 227. 275. 474. 228. 191. 329. 295. 318. 323. 276. 478. 167. 185. 350. 245. 406. 739. 225. 145. 310. 526. 465. 357. 194. 339. 132. 135. 256. 196. 220. 211. 464. 208. 287. 210. 432. 190. 314. 518. 291. 292. 184. 141. 267. 351. 163. 166. 510. 197. 244. 156. 354. 21. 252. 556. 153. 266. 517. 502. 165. 94. 246. 330. 440. 274. 349. 154. 233. 271. 4. 294. 159. 289. 342. 382. 344. 183. 175. 239. 237. 262. 552. 102. 775. 71. 476. 207. 492. 168. 283. 359. 320. 257. 33. 152. 348. 738. 93. 181. 369. 179. 358. 160. 192. 198. 263. 447. 29. 172. 104. 327. 125. 79. 326. 297. 174. 109. 101. 568. 62. 265. 232. 400. 230. 180. 81. 765. 80. 383. 193. 170. 333. 203. 321. 606. 144. 511. 212. 127. 78. 66. 97. 202. 136. 169. 200. 255. 173. 221. 82. 308. 301. 328. 199. 355. 529. 412. 106. 61. 217. 316. 352. 143. 148. 415. 146. 70. 269. 253. 281. 122. 157. 64. 142. 84. 201. 47. 114. 206. 222. 103. 236. 238. 107. 459. 151. 229. 158. 98. 393. 149. 138. 345. 120. 234. 134. 139. 155. 204. 95. 215. 325. 53. 46. 147. 178. 209. 19. 31. 129. 124. 35. 137. 121. 87. 63. 113. 205. 123. 272. 3. 140. 150. 119. 49. 177. 372. 290. 164. 12. 241. 161. 89. 131. 67. 130. 74. 435. 117. 108. 176. 299. 128. 88. 261. 73. 75. 39. 247. 59. 
388. 371. 76. 105. 242. 360. 112. 189. 92. 51. 293. 40. 90. 538. 307. 72. 86. 279. 96. 14. 60. 361. 42. 68. 596. 460. 249. 213. 118. 77. 171. 387. 110. 83. 34. 8. 223. 111. 100. 115. 54. 579. 56. 20. 57. 133. 26. 491. 55. 45. 224. 50. 44. 277. 391. 216. 558. 413. 457. 65. 116. 419. 2. 36. 99. 259. 356. 22. 214. 28. 296. 30. 48. 195. 32. 43. 454. 398. 38. 25. 23. 24. 656. 270. 27. 9. 433. 319. 41. 374. 341. 16. 420. 303. 260. 335. 273. 37. 546. 437. 126. 5. 340. 493. 332. 405. 285. 13. 584. 58. 52. 254. 522. 441. 15. 18. 449. 472. 268. 452. 589. 487. 305. 397. 69. 634. 417. 368. 7. 17. 426. 309. 373. 317. 336. 365. 445. 574. 394. 6. 423. 466. 11. 549. 597. 364. 282. 427. 390. 182. 588. 543. 479. 676. 278. 414. 331. 669. 489. 399. 385. 363. 410. 535. 386. 439. 346. 534. 411. 471. 444. 548. 425. 337. 533. 311. 663. 481. 409.] duration: - Total de datos únicos: 192 - Valores: [178. 169. 148. 164. nan 132. 156. 100. 141. 153. 183. 106. 151. 150. 143. 173. 136. 186. 113. 201. 194. 147. 131. 124. 135. 195. 108. 104. 165. 130. 142. 125. 123. 103. 118. 140. 149. 114. 116. 154. 122. 93. 98. 91. 158. 96. 127. 110. 144. 152. 94. 126. 112. 176. 95. 97. 109. 128. 102. 101. 120. 121. 182. 166. 137. 184. 206. 138. 157. 115. 111. 89. 105. 119. 129. 146. 88. 99. 90. 85. 92. 196. 133. 215. 60. 117. 107. 82. 159. 174. 134. 77. 170. 76. 171. 84. 22. 145. 78. 240. 172. 87. 216. 192. 44. 83. 139. 86. 162. 54. 80. 25. 74. 81. 177. 73. 43. 45. 163. 30. 212. 187. 189. 188. 280. 155. 64. 190. 75. 220. 160. 52. 325. 251. 202. 330. 289. 161. 28. 79. 63. 511. 42. 167. 193. 175. 185. 219. 7. 271. 50. 72. 24. 68. 225. 236. 180. 334. 270. 227. 286. 65. 55. 41. 69. 293. 200. 40. 168. 197. 181. 300. 23. 53. 46. 67. 199. 226. 37. 11. 66. 34. 20. 27. 70. 14. 71. 58. 35. 59. 62. 47.] 
director_facebook_likes:
- Total unique values: 436
- Values: [0.00e+00 5.63e+02 2.20e+04 1.31e+02 4.75e+02 1.50e+01 2.82e+02 3.95e+02 8.00e+01 2.52e+02 1.88e+02 4.64e+02 1.29e+02 9.40e+01 5.32e+02 3.65e+02 1.00e+03 1.30e+04 4.20e+02 3.70e+01 3.64e+02 4.87e+02 2.58e+02 1.25e+02 3.68e+02 1.40e+04 1.79e+02 1.13e+02 5.60e+01 6.81e+02 7.76e+02 1.10e+01 4.00e+03 1.70e+04 3.57e+02 4.52e+02 2.93e+02 2.18e+02 5.80e+01 2.08e+02 2.74e+02 1.71e+02 1.98e+02 5.96e+02 4.70e+01 3.10e+01 6.63e+02 3.80e+01 6.60e+01 2.55e+02 8.40e+01 5.71e+02 2.80e+01 2.10e+04 9.05e+02 5.08e+02 2.26e+02 2.49e+02 3.30e+01 5.00e+01 2.30e+02 1.50e+02 3.50e+01 1.89e+02 1.51e+02 6.90e+01 7.50e+02 2.00e+03 5.90e+01 1.20e+01 4.73e+02 3.94e+02 9.00e+01 2.50e+01 4.20e+01 4.56e+02 9.30e+01 1.76e+02 5.00e+00 5.20e+01 2.30e+01 3.80e+02 2.95e+02 5.03e+02 2.09e+02 6.00e+00 6.08e+02 3.86e+02 nan 1.30e+01 5.21e+02 5.40e+01 2.35e+02 9.60e+01 1.24e+02 1.07e+02 7.19e+02 3.23e+02 5.41e+02 6.10e+02 1.67e+02 1.60e+02 6.62e+02 1.23e+02 2.94e+02 4.46e+02 1.60e+01 1.90e+01 7.90e+01 1.28e+02 6.20e+01 5.50e+01 2.63e+02 6.70e+01 1.01e+02 1.53e+02 3.40e+01 6.30e+01 5.70e+01 1.20e+04 2.85e+02 1.60e+04 2.10e+01 1.00e+01 1.65e+02 1.40e+01 7.70e+01 2.07e+02 6.70e+02 2.60e+01 3.85e+02 2.00e+01 3.42e+02 6.11e+02 9.00e+00 1.16e+02 1.27e+02 4.40e+01 8.10e+01 7.00e+01 2.12e+02 1.02e+02 9.70e+01 7.00e+00 3.35e+02 2.21e+02 8.70e+01 4.68e+02 3.78e+02 5.45e+02 2.66e+02 3.60e+01 2.78e+02 1.68e+02 9.90e+01 7.63e+02 8.80e+01 4.80e+02 7.50e+01 9.10e+01 1.63e+02 1.54e+02 3.33e+02 1.17e+02 3.00e+01 3.01e+02 4.25e+02 4.00e+01 4.38e+02 6.50e+01 9.20e+01 4.30e+01 6.40e+01 2.87e+02 1.80e+01 3.09e+02 4.50e+01 2.75e+02 8.45e+02 1.26e+02 2.70e+01 2.72e+02 1.09e+02 7.20e+01 1.70e+01 3.83e+02 4.10e+01 2.53e+02 2.20e+01 4.88e+02 1.30e+02 9.06e+02 2.40e+01 8.00e+00 1.05e+02 2.90e+01 3.00e+03 4.48e+02 4.00e+00 7.60e+01 7.59e+02 1.10e+04 1.19e+02 6.87e+02 1.38e+02 4.36e+02 1.75e+02 3.22e+02 7.08e+02 1.97e+02 4.80e+01 3.20e+01 2.34e+02 7.37e+02 1.81e+02
1.62e+02 2.00e+00 5.30e+01 3.90e+01 1.92e+02 8.92e+02 3.00e+00 1.43e+02 5.10e+01 8.30e+01 1.08e+02 6.10e+01 3.17e+02 8.20e+01 6.07e+02 1.59e+02 1.61e+02 4.22e+02 7.10e+01 8.69e+02 1.80e+02 2.77e+02 2.41e+02 1.55e+02 1.48e+02 1.52e+02 1.74e+02 2.13e+02 6.44e+02 7.30e+01 1.34e+02 2.60e+02 9.80e+01 6.28e+02 1.18e+02 3.75e+02 6.31e+02 2.70e+02 3.50e+02 7.77e+02 4.90e+01 8.50e+01 3.26e+02 1.70e+02 5.17e+02 8.35e+02 3.11e+02 8.90e+01 4.60e+01 3.08e+02 1.33e+02 7.80e+01 1.64e+02 4.15e+02 1.90e+02 6.43e+02 5.84e+02 5.34e+02 3.04e+02 8.83e+02 3.38e+02 2.51e+02 5.29e+02 4.53e+02 1.94e+02 6.00e+03 1.32e+02 1.36e+02 1.00e+02 2.38e+02 6.55e+02 7.29e+02 2.46e+02 3.43e+02 1.50e+04 5.49e+02 1.10e+02 1.14e+02 1.49e+02 4.19e+02 2.65e+02 1.15e+02 7.10e+02 7.67e+02 1.20e+02 1.44e+02 6.50e+02 3.53e+02 1.87e+02 2.48e+02 9.73e+02 6.80e+01 2.19e+02 1.21e+02 2.14e+02 4.60e+02 9.11e+02 5.35e+02 5.97e+02 2.69e+02 1.80e+04 9.56e+02 3.00e+02 6.88e+02 1.95e+02 4.06e+02 1.40e+02 1.22e+02 3.10e+02 2.10e+02 9.09e+02 3.29e+02 5.00e+02 3.79e+02 3.37e+02 1.57e+02 5.12e+02 8.47e+02 2.01e+02 1.37e+02 4.05e+02 3.69e+02 1.72e+02 9.29e+02 3.74e+02 7.45e+02 5.61e+02 4.54e+02 7.56e+02 1.41e+02 4.45e+02 2.32e+02 7.52e+02 4.12e+02 7.99e+02 1.39e+02 2.61e+02 2.20e+02 6.00e+01 1.66e+02 2.16e+02 3.19e+02 6.67e+02 4.82e+02 4.34e+02 3.87e+02 7.87e+02 3.24e+02 5.93e+02 3.02e+02 8.00e+02 1.47e+02 7.98e+02 9.30e+02 2.36e+02 4.07e+02 5.48e+02 3.41e+02 2.00e+02 3.46e+02 4.74e+02 5.54e+02 2.28e+02 4.72e+02 1.12e+02 7.35e+02 7.81e+02 2.30e+04 5.92e+02 2.22e+02 4.40e+02 3.99e+02 3.45e+02 5.20e+02 2.39e+02 2.04e+02 7.66e+02 3.77e+02 3.58e+02 4.50e+02 6.75e+02 3.30e+02 2.27e+02 1.77e+02 1.03e+02 1.04e+02 6.73e+02 9.64e+02 3.93e+02 7.00e+02 7.40e+01 3.55e+02 1.84e+02 4.21e+02 5.89e+02 6.03e+02 2.44e+02 5.31e+02 2.00e+04 1.91e+02 5.22e+02 7.64e+02 1.35e+02 1.99e+02 2.43e+02 9.23e+02 1.42e+02 2.24e+02 2.47e+02 9.50e+01 4.90e+02 2.17e+02 4.31e+02 3.73e+02 6.86e+02 6.64e+02 4.67e+02 9.69e+02 1.58e+02 3.97e+02 2.91e+02] 
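The like counts above are whole numbers, yet they are printed in scientific notation, presumably because the column is stored as float to accommodate `nan`. One option, sketched below under that assumption, is to convert such a column to pandas' nullable `Int64` dtype so the values display as plain integers while missing entries are preserved. The sample `likes` series is a hypothetical excerpt, not the real column.

```python
import numpy as np
import pandas as pd

# Hypothetical excerpt of director_facebook_likes, stored as float (assumption).
likes = pd.Series([0.0, 563.0, 22000.0, np.nan], name="director_facebook_likes")

# Nullable integer dtype: keeps missing values as <NA> instead of forcing float.
likes_int = likes.astype("Int64")

# Unique non-missing values as plain Python integers, sorted for readability.
values = sorted(likes_int.dropna().unique().tolist())
```

The conversion only succeeds when every non-missing value is integral, so it also acts as a quick sanity check on columns labeled as discrete.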
actor_3_facebook_likes:
- Total unique values: 907
- Values: [8.55e+02 1.00e+03 1.61e+02 2.30e+04 nan 5.30e+02 4.00e+03 2.84e+02 1.90e+04 1.00e+04 2.00e+03 9.03e+02 3.93e+02 7.48e+02 2.01e+02 7.18e+02 7.73e+02 9.63e+02 7.38e+02 8.40e+01 7.94e+02 1.10e+04 6.27e+02 3.00e+03 5.60e+02 7.60e+02 4.64e+02 8.08e+02 8.25e+02 7.76e+02 3.26e+02 7.21e+02 9.88e+02 1.40e+04 2.00e+04 9.28e+02 1.40e+02 7.70e+01 2.36e+02 9.19e+02 5.81e+02 1.13e+02 8.38e+02 1.05e+02 5.22e+02 1.73e+02 3.10e+02 1.30e+04 1.03e+02 8.20e+01 2.62e+02 4.59e+02 5.82e+02 5.95e+02 3.29e+02 7.00e+03 5.09e+02 6.00e+01 5.70e+02 3.84e+02 5.91e+02 8.46e+02 8.84e+02 2.83e+02 9.82e+02 2.13e+02 6.04e+02 5.62e+02 8.33e+02 2.67e+02 5.35e+02 7.59e+02 1.91e+02 6.00e+03 1.20e+01 3.70e+02 7.02e+02 6.92e+02 6.48e+02 5.90e+01 6.91e+02 6.87e+02 9.79e+02 5.58e+02 5.88e+02 9.54e+02 4.36e+02 2.33e+02 4.90e+02 3.00e+01 1.20e+04 9.43e+02 2.94e+02 6.99e+02 1.82e+02 5.02e+02 1.60e+04 6.41e+02 1.62e+02 8.26e+02 1.50e+01 3.46e+02 2.56e+02 4.33e+02 5.86e+02 3.94e+02 5.17e+02 8.44e+02 7.46e+02 3.22e+02 5.37e+02 9.64e+02 9.67e+02 6.90e+02 4.45e+02 9.81e+02 9.53e+02 6.53e+02 4.13e+02 9.10e+01 2.58e+02 9.34e+02 8.48e+02 4.22e+02 2.44e+02 1.68e+02 8.82e+02 1.84e+02 3.58e+02 7.33e+02 4.63e+02 9.39e+02 1.79e+02 8.83e+02 6.80e+02 5.23e+02 1.83e+02 8.07e+02 8.77e+02 3.97e+02 2.82e+02 8.00e+03 8.45e+02 1.65e+02 8.94e+02 3.88e+02 1.59e+02 6.45e+02 3.62e+02 5.60e+01 1.54e+02 8.50e+02 2.17e+02 2.41e+02 6.02e+02 4.09e+02 6.36e+02 8.12e+02 4.61e+02 3.41e+02 4.02e+02 2.65e+02 5.67e+02 8.70e+01 1.57e+02 9.29e+02 1.41e+02 6.17e+02 9.00e+03 4.29e+02 1.30e+01 2.68e+02 7.99e+02 7.80e+01 3.87e+02 3.50e+02 8.27e+02 1.16e+02 7.41e+02 4.32e+02 1.11e+02 4.21e+02 5.44e+02 4.00e+01 2.02e+02 4.30e+02 2.97e+02 8.57e+02 4.47e+02 7.80e+02 4.34e+02 4.66e+02 1.95e+02 5.25e+02 3.72e+02 6.95e+02 5.33e+02 8.34e+02 5.39e+02 5.61e+02 6.18e+02 3.85e+02 7.08e+02 1.07e+02 3.83e+02 5.42e+02 2.53e+02 2.03e+02 2.79e+02 1.20e+02 4.52e+02 4.23e+02 5.77e+02 5.66e+02 4.67e+02
7.20e+02 5.26e+02 2.27e+02 5.71e+02 6.83e+02 9.50e+01 3.60e+01 2.80e+01 9.70e+02 4.65e+02 5.57e+02 7.22e+02 4.42e+02 9.16e+02 4.16e+02 7.66e+02 2.40e+02 9.68e+02 5.27e+02 7.79e+02 3.30e+02 4.76e+02 5.50e+01 8.10e+02 5.68e+02 1.17e+02 4.84e+02 9.18e+02 5.85e+02 4.89e+02 7.27e+02 6.35e+02 5.03e+02 6.81e+02 4.41e+02 1.35e+02 2.49e+02 8.86e+02 3.07e+02 2.81e+02 9.57e+02 6.12e+02 1.23e+02 5.54e+02 5.99e+02 2.72e+02 9.71e+02 1.29e+02 9.36e+02 4.71e+02 5.00e+03 1.18e+02 1.48e+02 8.20e+02 4.39e+02 6.19e+02 6.24e+02 2.90e+01 2.10e+02 5.51e+02 1.72e+02 8.09e+02 7.19e+02 1.06e+02 2.71e+02 2.00e+01 6.15e+02 5.21e+02 1.60e+02 4.95e+02 9.56e+02 8.98e+02 5.75e+02 3.90e+02 9.25e+02 4.43e+02 8.52e+02 1.10e+02 3.43e+02 1.63e+02 2.08e+02 8.05e+02 8.18e+02 0.00e+00 7.44e+02 9.15e+02 9.13e+02 3.66e+02 3.80e+01 7.10e+01 6.70e+01 6.97e+02 3.75e+02 2.69e+02 4.27e+02 8.59e+02 4.74e+02 6.20e+01 1.02e+02 6.58e+02 8.71e+02 8.00e+00 9.04e+02 4.10e+01 2.31e+02 1.40e+01 7.00e+00 2.12e+02 7.40e+01 8.47e+02 1.80e+01 8.76e+02 5.53e+02 7.71e+02 3.27e+02 9.24e+02 7.45e+02 9.33e+02 5.05e+02 3.28e+02 2.60e+02 2.32e+02 2.18e+02 4.46e+02 1.50e+04 4.12e+02 7.87e+02 1.04e+02 1.38e+02 1.93e+02 9.90e+01 2.22e+02 5.41e+02 6.37e+02 2.15e+02 4.15e+02 7.15e+02 1.30e+02 9.89e+02 5.74e+02 4.03e+02 4.62e+02 2.42e+02 4.80e+01 7.69e+02 9.95e+02 7.26e+02 5.76e+02 5.59e+02 8.41e+02 9.92e+02 6.38e+02 1.70e+01 7.74e+02 4.00e+02 1.50e+02 3.24e+02 6.60e+01 4.07e+02 2.98e+02 8.54e+02 3.01e+02 2.46e+02 4.97e+02 5.80e+02 9.75e+02 9.70e+01 3.34e+02 5.31e+02 9.11e+02 3.59e+02 1.12e+02 9.12e+02 8.35e+02 3.54e+02 6.55e+02 5.52e+02 7.75e+02 9.40e+01 3.69e+02 6.25e+02 1.74e+02 5.20e+02 9.74e+02 2.63e+02 5.06e+02 2.50e+01 2.21e+02 1.60e+01 4.81e+02 6.68e+02 6.72e+02 7.23e+02 5.63e+02 8.67e+02 6.64e+02 6.93e+02 3.11e+02 7.30e+01 3.45e+02 4.75e+02 1.45e+02 4.05e+02 7.54e+02 3.57e+02 3.80e+02 5.79e+02 8.11e+02 6.00e+00 1.75e+02 2.26e+02 5.70e+01 7.51e+02 4.51e+02 7.67e+02 6.26e+02 4.60e+02 7.16e+02 4.88e+02 2.88e+02 8.64e+02 2.95e+02 
6.42e+02 9.44e+02 3.00e+02 4.85e+02 2.04e+02 3.79e+02 8.96e+02 2.30e+02 2.06e+02 4.80e+02 9.06e+02 3.08e+02 6.52e+02 5.69e+02 9.47e+02 5.07e+02 6.80e+01 7.78e+02 1.96e+02 4.37e+02 2.80e+02 2.20e+01 3.61e+02 4.40e+02 3.18e+02 5.43e+02 5.12e+02 3.49e+02 8.39e+02 5.29e+02 5.19e+02 4.79e+02 6.22e+02 4.55e+02 5.80e+01 6.76e+02 6.50e+02 3.48e+02 1.94e+02 6.39e+02 7.52e+02 4.58e+02 7.29e+02 1.86e+02 6.65e+02 6.40e+02 7.20e+01 2.43e+02 3.40e+01 5.94e+02 3.52e+02 5.18e+02 2.87e+02 4.60e+01 1.77e+02 5.34e+02 5.65e+02 8.89e+02 4.20e+02 4.30e+01 7.53e+02 3.03e+02 3.04e+02 9.31e+02 6.34e+02 2.10e+01 7.50e+01 7.90e+01 7.28e+02 5.93e+02 5.97e+02 2.85e+02 8.78e+02 8.28e+02 4.17e+02 6.51e+02 5.84e+02 7.01e+02 2.77e+02 2.93e+02 8.74e+02 6.60e+02 1.32e+02 4.77e+02 1.44e+02 4.14e+02 6.79e+02 9.35e+02 3.68e+02 7.06e+02 3.23e+02 1.58e+02 7.17e+02 3.90e+01 5.00e+00 4.01e+02 5.11e+02 7.64e+02 3.98e+02 7.30e+02 7.95e+02 3.17e+02 6.00e+02 8.97e+02 5.92e+02 3.38e+02 8.30e+02 2.37e+02 2.35e+02 2.61e+02 8.21e+02 4.82e+02 2.30e+01 3.02e+02 6.77e+02 3.19e+02 3.64e+02 2.55e+02 6.28e+02 7.50e+02 5.14e+02 5.48e+02 6.31e+02 7.34e+02 5.01e+02 4.04e+02 1.67e+02 2.52e+02 5.10e+02 1.90e+01 1.00e+01 6.10e+01 8.65e+02 5.00e+01 1.14e+02 5.00e+02 7.89e+02 8.93e+02 6.43e+02 6.78e+02 2.25e+02 5.47e+02 7.49e+02 4.91e+02 2.92e+02 3.95e+02 2.75e+02 3.65e+02 5.20e+01 9.96e+02 9.60e+01 5.40e+01 2.96e+02 1.27e+02 2.51e+02 3.36e+02 8.10e+01 4.90e+01 9.23e+02 2.57e+02 4.00e+00 6.33e+02 1.19e+02 3.89e+02 4.18e+02 8.23e+02 2.11e+02 4.73e+02 2.14e+02 2.23e+02 6.40e+01 4.40e+01 3.53e+02 6.74e+02 6.16e+02 6.85e+02 3.51e+02 5.72e+02 7.62e+02 1.71e+02 6.13e+02 2.59e+02 2.19e+02 4.72e+02 2.00e+00 4.83e+02 5.96e+02 1.00e+02 7.85e+02 4.78e+02 4.94e+02 3.77e+02 1.08e+02 7.36e+02 7.24e+02 3.50e+01 6.44e+02 1.33e+02 4.50e+02 5.10e+01 5.38e+02 2.78e+02 9.45e+02 8.00e+01 4.26e+02 1.90e+02 6.46e+02 2.48e+02 3.63e+02 3.74e+02 2.29e+02 9.73e+02 6.30e+01 5.49e+02 3.44e+02 6.63e+02 1.36e+02 1.34e+02 4.48e+02 1.15e+02 4.50e+01 8.16e+02 
2.39e+02 1.10e+01 1.78e+02 5.04e+02 7.96e+02 5.73e+02 3.55e+02 8.30e+01 1.42e+02 2.45e+02 3.20e+02 2.40e+01 6.70e+02 1.25e+02 9.00e+01 7.81e+02 8.02e+02 9.46e+02 3.60e+02 5.45e+02 7.86e+02 9.40e+02 9.66e+02 8.37e+02 3.70e+01 2.86e+02 4.24e+02 4.68e+02 3.42e+02 3.92e+02 2.73e+02 4.38e+02 6.90e+01 4.31e+02 9.00e+02 7.98e+02 8.43e+02 1.87e+02 7.00e+02 3.13e+02 3.73e+02 8.90e+01 2.50e+02 4.70e+01 3.25e+02 9.02e+02 4.53e+02 8.99e+02 2.28e+02 3.00e+00 1.37e+02 1.70e+04 5.28e+02 1.55e+02 2.60e+01 1.51e+02 7.11e+02 1.98e+02 1.64e+02 6.62e+02 1.88e+02 8.87e+02 9.17e+02 8.72e+02 3.67e+02 7.42e+02 9.60e+02 1.99e+02 7.43e+02 9.77e+02 9.80e+01 3.76e+02 3.47e+02 7.00e+01 3.99e+02 9.20e+01 3.09e+02 4.20e+01 4.92e+02 3.10e+01 6.11e+02 1.97e+02 2.54e+02 3.82e+02 8.80e+01 2.34e+02 8.60e+01 2.66e+02 2.90e+02 2.16e+02 4.06e+02 1.49e+02 2.20e+02 8.42e+02 6.05e+02 8.50e+01 4.87e+02 8.61e+02 1.09e+02 8.51e+02 3.32e+02 1.01e+02 9.49e+02 7.55e+02 5.87e+02 2.70e+02 7.56e+02 8.06e+02 4.70e+02 9.42e+02 9.05e+02 8.01e+02 4.99e+02 1.53e+02 7.83e+02 9.26e+02 1.81e+02 7.25e+02 9.00e+00 5.30e+01 3.06e+02 6.30e+02 3.56e+02 8.36e+02 2.70e+01 9.85e+02 4.57e+02 5.55e+02 6.06e+02 1.24e+02 3.71e+02 2.38e+02 9.22e+02 3.20e+01 7.57e+02 9.20e+02 1.70e+02 5.98e+02 8.49e+02 4.86e+02 5.83e+02 4.28e+02 5.78e+02 3.16e+02 8.88e+02 2.47e+02 3.05e+02 6.07e+02 7.31e+02 6.49e+02 6.50e+01 6.29e+02 7.82e+02 2.74e+02 2.24e+02 3.86e+02 2.89e+02 6.69e+02 1.21e+02 1.76e+02 4.96e+02 6.54e+02 1.69e+02 3.78e+02 9.30e+01 8.29e+02 4.35e+02 1.47e+02 7.93e+02 1.22e+02 6.96e+02 1.92e+02 1.46e+02 4.11e+02 6.32e+02 6.21e+02 6.47e+02 5.32e+02 8.60e+02 9.01e+02 1.39e+02 8.04e+02 5.36e+02 8.69e+02 4.44e+02 6.88e+02 1.31e+02 8.85e+02 8.75e+02 1.26e+02 7.70e+02 2.00e+02 5.13e+02 8.62e+02 9.08e+02 6.73e+02 2.91e+02 7.10e+02 2.09e+02 3.30e+01 1.89e+02 3.91e+02 8.90e+02 7.60e+01 4.10e+02 4.93e+02 6.89e+02 1.80e+02 3.81e+02 5.46e+02 1.52e+02 4.19e+02 6.94e+02 7.04e+02 6.57e+02 4.98e+02 2.05e+02 7.97e+02 5.56e+02 2.76e+02 6.86e+02 6.01e+02 
3.39e+02 4.25e+02 7.12e+02 1.43e+02 6.82e+02 7.39e+02 9.69e+02 4.49e+02 4.69e+02 3.31e+02 4.54e+02 3.15e+02 6.98e+02 6.08e+02 1.66e+02 3.33e+02 3.12e+02 1.28e+02 7.72e+02 8.91e+02 3.96e+02 6.59e+02 6.20e+02 3.21e+02 1.85e+02 1.56e+02]
actor_2_name:
- Total unique values: 3033
- Values: ['Joel David Moore' 'Orlando Bloom' 'Rory Kinnear' ... 'Valorie Curry' 'Maxwell Moody' 'Brian Herzlinger']
actor_1_facebook_likes:
- Total unique values: 879
- Values: [1.00e+03 4.00e+04 1.10e+04 2.70e+04 1.31e+02 6.40e+02 2.40e+04 7.99e+02 2.60e+04 2.50e+04 1.50e+04 1.80e+04 4.51e+02 2.20e+04 1.00e+04 5.00e+03 8.91e+02 1.60e+04 6.00e+03 2.90e+04 2.10e+04 1.40e+04 3.00e+03 8.83e+02 2.00e+04 1.20e+04 8.94e+02 9.74e+02 4.40e+04 2.30e+04 1.70e+04 3.40e+04 1.90e+04 9.79e+02 2.75e+02 2.00e+03 9.98e+02 2.68e+02 8.70e+04 7.11e+02 6.22e+02 4.00e+03 7.56e+02 9.75e+02 8.90e+02 6.48e+02 5.44e+02 5.31e+02 6.62e+02 4.90e+04 3.09e+02 2.34e+02 7.30e+02 9.21e+02 8.51e+02 7.69e+02 7.83e+02 1.30e+04 9.57e+02 8.20e+02 9.86e+02 7.66e+02 6.13e+02 9.82e+02 5.35e+02 6.88e+02 6.05e+02 8.45e+02 9.20e+02 7.84e+02 7.74e+02 8.86e+02 6.90e+02 2.73e+02 9.36e+02 8.33e+02 3.90e+01 6.50e+02 5.96e+02 8.11e+02 4.80e+02 6.69e+02 3.83e+02 6.81e+02 6.73e+02 7.60e+02 8.52e+02 5.00e+00 7.52e+02 7.80e+02 5.58e+02 9.25e+02 6.60e+02 8.27e+02 8.00e+03 4.90e+02 1.44e+02 6.23e+02 9.62e+02 8.73e+02 6.93e+02 7.70e+02 1.91e+02 8.35e+02 3.50e+04 6.91e+02 5.91e+02 1.92e+02 9.19e+02 6.80e+02 8.87e+02 2.83e+02 6.70e+02 8.79e+02 5.84e+02 6.72e+02 8.13e+02 6.11e+02 4.09e+02 8.75e+02 5.77e+02 8.98e+02 3.40e+02 9.00e+03 2.30e+01 9.81e+02 8.82e+02 7.43e+02 5.29e+02 2.00e+00 5.63e+02 6.70e+01 7.95e+02 1.63e+02 7.00e+03 6.10e+02 7.19e+02 8.67e+02 9.12e+02 3.74e+02 7.45e+02 5.82e+02 9.33e+02 5.48e+02 9.06e+02 9.40e+02 2.10e+01 7.10e+02 6.92e+02 8.70e+02 9.60e+02 4.43e+02 9.03e+02 9.89e+02 3.94e+02 3.49e+02 8.44e+02 5.76e+02 6.14e+02 9.66e+02 7.88e+02 5.37e+02 9.95e+02 9.01e+02 8.48e+02 6.00e+02 5.09e+02 3.24e+02 9.63e+02 1.77e+02
6.45e+02 9.67e+02 5.24e+02 4.60e+04 4.19e+02 9.88e+02 4.33e+02 6.58e+02 8.81e+02 8.54e+02 3.66e+02 8.34e+02 7.32e+02 7.89e+02 5.39e+02 8.74e+02 3.30e+04 3.10e+02 4.36e+02 5.10e+02 2.87e+02 8.65e+02 7.22e+02 9.68e+02 8.89e+02 9.73e+02 7.94e+02 9.70e+02 1.42e+02 9.31e+02 9.26e+02 8.47e+02 4.85e+02 5.34e+02 5.70e+01 8.38e+02 9.39e+02 9.56e+02 2.26e+02 6.17e+02 2.77e+02 5.93e+02 1.17e+02 6.31e+02 1.34e+02 2.11e+02 9.61e+02 6.77e+02 1.45e+02 6.25e+02 4.26e+02 1.54e+02 4.16e+02 4.95e+02 9.92e+02 4.37e+02 2.94e+02 6.27e+02 3.30e+02 3.25e+02 4.50e+04 4.30e+02 6.29e+02 5.10e+01 9.72e+02 9.84e+02 8.41e+02 5.99e+02 5.98e+02 7.79e+02 1.13e+02 6.78e+02 2.90e+01 9.24e+02 8.07e+02 5.45e+02 8.21e+02 5.21e+02 7.49e+02 5.50e+02 4.27e+02 9.13e+02 9.05e+02 7.23e+02 2.67e+02 1.64e+02 8.49e+02 2.19e+02 4.40e+01 4.05e+02 5.79e+02 4.68e+02 6.83e+02 9.22e+02 6.24e+02 3.26e+02 6.87e+02 1.07e+02 7.34e+02 6.68e+02 9.70e+01 8.55e+02 9.04e+02 7.00e+02 1.64e+05 5.78e+02 8.36e+02 7.08e+02 3.87e+02 9.54e+02 8.80e+02 4.72e+02 1.76e+02 7.20e+02 7.03e+02 7.87e+02 2.93e+02 6.07e+02 9.64e+02 3.27e+02 8.77e+02 9.78e+02 6.36e+02 7.82e+02 9.76e+02 6.10e+01 8.84e+02 4.99e+02 2.18e+02 1.52e+02 7.47e+02 7.55e+02 5.92e+02 6.43e+02 7.60e+01 3.08e+02 6.28e+02 3.70e+01 3.78e+02 7.86e+02 7.40e+02 4.63e+02 5.51e+02 1.80e+02 4.96e+02 7.46e+02 6.85e+02 9.44e+02 5.41e+02 4.61e+02 8.08e+02 7.68e+02 7.78e+02 4.92e+02 9.47e+02 9.00e+02 8.12e+02 8.26e+02 9.69e+02 8.97e+02 2.88e+02 3.96e+02 4.46e+02 7.75e+02 9.07e+02 4.65e+02 5.53e+02 6.94e+02 6.35e+02 6.95e+02 8.37e+02 3.00e+00 3.00e+01 6.21e+02 5.54e+02 3.72e+02 6.00e+00 8.18e+02 1.65e+02 7.73e+02 9.18e+02 3.44e+02 7.16e+02 6.49e+02 6.38e+02 8.39e+02 2.44e+02 6.16e+02 9.71e+02 4.42e+02 3.03e+02 5.06e+02 5.67e+02 6.96e+02 4.55e+02 5.85e+02 8.69e+02 5.81e+02 1.57e+02 6.63e+02 7.59e+02 4.89e+02 4.73e+02 4.48e+02 1.47e+02 8.76e+02 2.95e+02 1.14e+02 5.23e+02 7.57e+02 9.55e+02 5.04e+02 5.59e+02 6.39e+02 1.73e+02 1.10e+01 6.64e+02 5.80e+02 9.43e+02 7.31e+02 4.22e+02 4.97e+02 
6.30e+01 8.06e+02 7.38e+02 5.94e+02 3.58e+02 3.86e+02 9.53e+02 6.60e+01 4.91e+02 7.48e+02 3.85e+02 3.32e+02 7.71e+02 6.42e+02 2.60e+02 6.55e+02 9.34e+02 2.64e+02 4.40e+02 7.13e+02 6.40e+05 7.42e+02 2.72e+02 7.44e+02 2.96e+02 3.28e+02 6.01e+02 2.85e+02 4.69e+02 4.60e+02 9.00e+01 6.18e+02 4.50e+01 4.62e+02 9.02e+02 7.98e+02 3.55e+02 9.85e+02 3.38e+02 5.62e+02 4.00e+02 1.72e+02 1.81e+02 9.41e+02 1.74e+02 8.43e+02 9.27e+02 9.08e+02 2.79e+02 7.90e+01 8.93e+02 5.12e+02 7.21e+02 5.27e+02 2.01e+02 8.05e+02 5.56e+02 1.29e+02 3.29e+02 8.96e+02 3.90e+02 1.50e+02 2.40e+02 2.03e+02 8.09e+02 2.00e+01 5.33e+02 9.80e+01 6.08e+02 5.26e+02 3.04e+02 5.08e+02 5.65e+02 8.60e+02 3.10e+04 1.41e+02 7.41e+02 1.25e+02 6.46e+02 0.00e+00 6.06e+02 9.91e+02 9.09e+02 6.51e+02 3.02e+02 1.37e+05 3.46e+02 8.56e+02 2.04e+02 9.97e+02 8.85e+02 8.29e+02 5.72e+02 8.01e+02 4.88e+02 6.99e+02 4.60e+01 1.06e+02 4.03e+02 4.83e+02 1.09e+02 7.29e+02 4.86e+02 8.88e+02 5.00e+02 3.41e+02 1.70e+01 9.48e+02 6.54e+02 8.20e+01 7.27e+02 8.28e+02 1.55e+02 9.46e+02 5.17e+02 2.10e+02 7.12e+02 4.56e+02 5.97e+02 3.64e+02 8.40e+01 4.44e+02 9.23e+02 1.58e+02 8.66e+02 5.49e+02 4.82e+02 4.53e+02 7.15e+02 7.67e+02 6.34e+02 2.14e+02 7.14e+02 2.27e+02 9.96e+02 2.41e+02 9.11e+02 8.61e+02 6.33e+02 1.75e+02 5.55e+02 5.03e+02 2.32e+02 2.35e+02 1.03e+02 9.40e+01 8.16e+02 1.16e+02 1.93e+02 4.71e+02 3.54e+02 7.96e+02 5.95e+02 6.50e+01 4.77e+02 8.57e+02 1.36e+02 6.40e+01 8.25e+02 7.76e+02 7.35e+02 1.79e+02 4.67e+02 8.92e+02 7.06e+02 7.18e+02 8.78e+02 4.13e+02 5.43e+02 1.59e+02 4.23e+02 1.49e+02 4.12e+02 3.80e+02 6.97e+02 7.00e+01 5.69e+02 6.56e+02 1.30e+02 6.19e+02 5.71e+02 8.99e+02 6.59e+02 8.03e+02 5.52e+02 3.06e+02 1.70e+02 7.70e+01 7.17e+02 9.30e+01 2.80e+01 4.52e+02 3.47e+02 7.33e+02 3.62e+02 5.25e+02 6.00e+01 7.97e+02 9.00e+00 6.86e+02 5.20e+01 3.99e+02 1.50e+01 2.15e+02 7.85e+02 2.76e+02 4.25e+02 9.17e+02 6.65e+02 3.81e+02 5.64e+02 6.80e+01 4.94e+02 7.26e+02 3.71e+02 9.29e+02 2.20e+02 2.58e+02 7.72e+02 2.40e+01 7.53e+02 8.23e+02 
4.02e+02 7.54e+02 5.30e+02 3.07e+02 9.90e+02 6.03e+02 4.78e+02 7.80e+01 4.39e+02 1.22e+02 2.98e+02 3.00e+04 1.83e+02 3.98e+02 7.64e+02 5.57e+02 1.48e+02 1.87e+02 3.11e+02 3.92e+02 7.02e+02 7.00e+00 1.86e+02 6.79e+02 1.85e+02 5.73e+02 9.49e+02 3.50e+01 7.36e+02 2.05e+02 2.17e+02 1.33e+02 3.89e+02 4.49e+02 3.63e+02 4.84e+02 6.66e+02 4.34e+02 9.77e+02 5.02e+02 3.73e+02 7.10e+01 2.62e+02 5.32e+02 2.25e+02 5.13e+02 6.37e+02 2.46e+02 7.61e+02 2.06e+02 4.74e+02 4.00e+00 9.10e+01 9.90e+01 2.70e+01 8.04e+02 4.35e+02 5.60e+01 4.31e+02 1.80e+01 3.82e+02 3.21e+02 9.37e+02 2.63e+02 3.97e+02 6.82e+02 3.91e+02 3.43e+02 2.29e+02 2.50e+01 3.68e+02 3.10e+01 2.86e+02 3.93e+02 2.08e+02 9.35e+02 2.69e+02 5.20e+02 8.59e+02 5.50e+01 7.50e+01 1.97e+02 4.32e+02 3.31e+02 5.70e+02 2.37e+02 7.40e+01 1.88e+02 5.89e+02 2.74e+02 8.50e+01 4.64e+02 1.10e+02 7.07e+02 2.30e+02 5.22e+02 8.62e+02 4.20e+02 4.80e+01 9.45e+02 2.54e+02 7.62e+02 2.23e+02 7.63e+02 1.66e+02 5.07e+02 4.14e+02 1.19e+02 8.90e+01 5.75e+02 2.55e+02 1.95e+02 5.60e+02 6.32e+02 2.80e+02 4.38e+02 9.42e+02 2.53e+02 8.80e+01 5.16e+02 1.40e+02 6.41e+02 6.74e+02 5.74e+02 5.36e+02 4.00e+01 2.89e+02 4.87e+02 2.31e+02 6.52e+02 3.19e+02 2.16e+02 4.47e+02 1.05e+02 4.81e+02 1.23e+02 8.71e+02 5.00e+01 3.34e+02 1.27e+02 5.86e+02 6.75e+02 1.60e+01 7.20e+01 4.20e+01 6.76e+02 1.28e+02 1.20e+02 8.46e+02 3.14e+02 5.38e+02 2.51e+02 2.65e+02 4.59e+02 1.78e+02 1.68e+02 3.88e+02 1.89e+02 3.30e+01 3.53e+02 1.60e+02 1.21e+02 1.61e+02 4.10e+01 6.47e+02 7.01e+02 8.60e+01 2.81e+02 5.30e+01 4.21e+02 2.60e+05 3.80e+01 1.99e+02 5.01e+02 8.70e+01 2.82e+02 nan 3.40e+01 4.98e+02 5.11e+02 1.40e+01 1.20e+01 2.66e+02 4.30e+01 3.56e+02 2.02e+02 6.12e+02 2.45e+02 4.06e+02 1.02e+02 7.24e+02 5.90e+02 9.60e+01 4.58e+02 2.24e+02 3.35e+02 9.20e+01 6.57e+02 9.38e+02 2.84e+02 4.18e+02 4.90e+01 2.47e+02 3.12e+02 1.08e+02 8.00e+01 6.02e+02 5.87e+02 4.07e+02 1.96e+02 5.40e+01 6.90e+01 9.80e+02 3.42e+02 2.39e+02 3.22e+02 4.66e+02 8.00e+00 2.20e+01 3.61e+02 5.80e+01 5.90e+01 
1.69e+02 4.70e+02 6.15e+02 3.59e+02 4.70e+01 1.90e+01 3.75e+02 3.18e+02 6.44e+02 1.56e+02 3.60e+02 2.50e+02 7.93e+02 1.00e+02 1.35e+02 3.37e+02 2.36e+02 8.10e+01 8.14e+02 7.28e+02 7.70e+04 2.59e+02 1.38e+02 5.05e+02 3.60e+01 2.70e+02 3.76e+02 2.00e+02 8.30e+02 1.26e+02 1.43e+02 2.38e+02 1.18e+02 1.11e+02 2.60e+01 7.25e+02 2.52e+02 3.20e+01 3.13e+02 3.70e+02 6.30e+02 1.00e+01 2.91e+02]
gross:
- Total unique values: 4036
- Values: [7.60505847e+08 3.09404152e+08 2.00074175e+08 ... 4.58400000e+03 1.04430000e+04 8.52220000e+04]
genres:
- Total unique values: 914
- Values: ['Action|Adventure|Fantasy|Sci-Fi' 'Action|Adventure|Fantasy' 'Action|Adventure|Thriller' 'Action|Thriller' 'Documentary' 'Action|Adventure|Sci-Fi' 'Action|Adventure|Romance' 'Adventure|Animation|Comedy|Family|Fantasy|Musical|Romance' 'Adventure|Family|Fantasy|Mystery' 'Action|Adventure' 'Action|Adventure|Western' 'Action|Adventure|Family|Fantasy' 'Action|Adventure|Comedy|Family|Fantasy|Sci-Fi' 'Adventure|Fantasy' 'Action|Adventure|Drama|History' 'Adventure|Family|Fantasy' 'Action|Adventure|Drama|Romance' 'Drama|Romance' 'Action|Adventure|Sci-Fi|Thriller' 'Action|Adventure|Fantasy|Romance' 'Action|Adventure|Fantasy|Sci-Fi|Thriller' 'Adventure|Animation|Comedy|Family|Fantasy' 'Adventure|Animation|Comedy|Family|Sport' 'Action|Crime|Thriller' 'Action|Adventure|Horror|Sci-Fi|Thriller' 'Adventure|Animation|Family|Sci-Fi' 'Action|Comedy|Crime|Thriller' 'Animation|Drama|Family|Fantasy' 'Action|Crime|Drama|Thriller' 'Adventure|Animation|Comedy|Family' 'Action|Adventure|Animation|Comedy|Family|Sci-Fi' 'Adventure|Drama|Family|Mystery' 'Action|Comedy|Sci-Fi|Western' 'Action|Adventure|Fantasy|Horror|Thriller' 'Action|Adventure|Comedy|Sci-Fi' 'Comedy|Family|Fantasy' 'Adventure|Animation|Comedy|Drama|Family|Fantasy' 'Adventure|Drama|Family|Fantasy' 'Action|Adventure|Drama|Fantasy' 'Action|Adventure|Family|Fantasy|Romance' 'Action|Adventure|Drama|Sci-Fi' 'Action|Adventure|Romance|Sci-Fi'
'Action|Adventure|Family|Mystery|Sci-Fi' 'Action|Adventure|Animation|Comedy|Drama|Family|Sci-Fi' 'Adventure|Animation|Comedy|Family|Sci-Fi' 'Adventure|Animation|Family|Fantasy' 'Action|Sci-Fi' 'Adventure|Drama|Sci-Fi' 'Action|Adventure|Drama|Horror|Sci-Fi' 'Drama|Fantasy|Romance' 'Adventure|Sci-Fi' 'Action|Adventure|Drama|Thriller' 'Action|Drama|History|Romance|War' 'Action|Adventure|Biography|Drama|History|Romance|War' 'Action|Drama' 'Drama|Horror|Sci-Fi' 'Adventure|Comedy|Family|Fantasy' 'Animation|Comedy|Family|Fantasy' 'Action|Adventure|Animation|Comedy|Family' 'Adventure|Animation|Comedy|Family|Fantasy|Musical' 'Mystery|Thriller' 'Adventure|Animation|Comedy|Drama|Family' 'Action|Adventure|Animation|Comedy|Family|Fantasy|Sci-Fi' 'Comedy|Fantasy|Horror' 'Drama|Fantasy|Horror|Thriller' 'Action|Drama|Thriller' 'Adventure' 'Action|Comedy|Fantasy|Sci-Fi' 'Action|Adventure|Comedy|Family|Fantasy|Mystery|Sci-Fi' 'Action|Adventure|Animation|Fantasy' 'Comedy|Crime' 'Action|Drama|History|War' 'Action|Adventure|Drama' 'Action|Adventure|Animation|Comedy|Family|Fantasy' 'Action|Drama|Mystery|Sci-Fi' 'Action|Adventure|Comedy|Thriller' 'Action|Adventure|Animation|Fantasy|Romance|Sci-Fi' 'Action|Adventure|Drama|History|War' 'Adventure|Drama|Fantasy|Romance' 'Animation|Comedy|Family|Musical' 'Action|Crime|Drama|Mystery|Thriller' 'Adventure|Drama|Thriller|Western' 'Adventure|Animation|Comedy|Family|Western' 'Action|Mystery|Thriller' 'Adventure|Sci-Fi|Thriller' 'Adventure|Animation|Comedy|Family|Fantasy|Sci-Fi' 'Action|Crime|Mystery|Thriller' 'Action|Adventure|Family|Mystery' 'Adventure|Drama|Romance|War' 'Adventure|Animation|Family|Thriller' 'Action|Fantasy' 'Action|Animation|Comedy|Family|Sci-Fi' 'Action|Comedy|Fantasy' 'Fantasy' 'Adventure|Animation|Comedy|Family|Musical' 'Action|Adventure|Crime|Mystery|Thriller' 'Action|Adventure|History' 'Action' 'Adventure|Drama|Fantasy' 'Action|Fantasy|Thriller' 'Action|Adventure|Comedy|Crime' 'Adventure|Mystery|Sci-Fi' 
'Action|Drama|Sci-Fi|Thriller' 'Action|Crime|Sci-Fi|Thriller' 'Action|Family|Sport' 'Comedy|Drama|Romance' 'Action|Comedy|Romance' 'Action|Adventure|Mystery|Sci-Fi' 'Action|Drama|War' 'Adventure|Drama|Sci-Fi|Thriller' 'Action|Adventure|Comedy|Family|Fantasy' 'Crime|Thriller' 'Action|Comedy|Crime|Romance|Thriller' 'Biography|Drama' 'Action|Comedy|Crime|Sci-Fi|Thriller' 'Action|Adventure|Crime' 'Action|Drama|Fantasy|War' 'Animation|Comedy|Family|Music|Western' 'Action|Adventure|Mystery|Sci-Fi|Thriller' 'Action|Drama|Sci-Fi|Sport' 'Action|Crime|Romance|Thriller' 'Action|Adventure|Comedy' 'Biography|Drama|Sport' 'Action|Mystery|Sci-Fi|Thriller' 'Animation|Family|Fantasy|Musical|Romance' 'Comedy' 'Action|Adventure|Romance|Sci-Fi|Thriller' 'Comedy|Romance' 'Action|Drama|Romance' 'Biography|Crime|Drama|History|Romance' 'Biography|Crime|Drama' 'Action|Comedy|Thriller' 'Action|Comedy|Crime' 'Action|Drama|Mystery|Thriller' 'Drama|Western' 'Animation|Drama|Family|Musical|Romance' 'Action|Adventure|Comedy|Family|Mystery' 'Action|Romance|Thriller' 'Action|Fantasy|Horror|Mystery' 'Adventure|Drama|Thriller' 'Biography|Comedy|Crime|Drama' 'Action|Sci-Fi|War' 'Drama|Sci-Fi' 'Action|Adventure|Animation|Family|Fantasy' 'Action|Crime|Fantasy|Romance|Thriller' 'Adventure|Comedy|Sci-Fi' 'Action|Crime|Sport|Thriller' 'Action|Adventure|Biography|Drama|History|Thriller' 'Action|Comedy|Sci-Fi' 'Action|Drama|Thriller|War' 'Drama|Mystery|Thriller' 'Action|Adventure|Fantasy|Thriller' 'Crime|Drama' 'Drama|History|Romance|War' 'Animation|Comedy|Family|Sport' 'Comedy|Sci-Fi|Thriller' 'Drama|History|War' 'Adventure|Animation|Comedy|Family|Romance' 'Drama|Family|Fantasy|Romance' 'Drama|Fantasy|Thriller' 'Drama|Mystery|Romance|Sci-Fi|Thriller' 'Drama|History|War|Western' 'Action|Adventure|Animation|Family' 'Adventure|Comedy|Family|Mystery|Sci-Fi' 'Drama|Fantasy|Horror|Mystery|Thriller' 'Animation|Comedy|Family|Sci-Fi' 'Adventure|Comedy|Drama|Fantasy|Romance' 'Action|Adventure|Comedy|Crime|Thriller' 
'Crime|Drama|Thriller' 'Adventure|Animation|Family|Fantasy|Musical|War' 'Action|Comedy' 'Crime|Drama|Mystery|Thriller' 'Adventure|Drama|History' 'Action|Adventure|Animation|Family|Fantasy|Sci-Fi' 'Adventure|Animation|Comedy|Family|Fantasy|Music' 'Drama|History|Thriller|War' 'Action|Animation|Comedy|Sci-Fi' 'Comedy|Family|Fantasy|Horror|Mystery' 'Drama|Mystery|Sci-Fi|Thriller' 'Action|Horror|Sci-Fi|Thriller' 'Crime|Mystery|Thriller' 'Action|Adventure|Comedy|Crime|Mystery|Thriller' 'Comedy|Drama|Sci-Fi' 'Action|Family|Fantasy|Musical' 'Drama|History|Sport' 'Adventure|Drama|Romance' 'Animation|Comedy|Family|Music|Romance' 'Animation|Comedy|Family|Fantasy|Musical|Romance' 'Crime|Drama|Horror|Mystery|Thriller' 'Adventure|Comedy|Family' 'Action|Adventure|Comedy|Fantasy' 'Comedy|Drama|Music|Musical' 'Adventure|Comedy|Drama|Family|Fantasy' 'Action|Comedy|Fantasy|Romance' 'Comedy|Romance|Sci-Fi' 'Adventure|Comedy|Mystery' 'Comedy|Drama|Fantasy|Romance' 'Action|Comedy|Family|Fantasy' 'Action|Adventure|Fantasy|Horror|Sci-Fi' 'Crime|Drama|History|Mystery|Thriller' 'Comedy|Drama' 'Adventure|Animation|Comedy|Drama|Family|Fantasy|Sci-Fi' 'Action|Drama|Romance|Sci-Fi|Thriller' 'Comedy|Crime|Sport' 'Comedy|Family|Fantasy|Romance' 'Action|Adventure|Crime|Drama|Sci-Fi|Thriller' 'Adventure|Drama|History|Romance|War' 'Comedy|Family|Sci-Fi' 'Fantasy|Horror|Mystery|Thriller' 'Adventure|Animation|Comedy|Family|Fantasy|Sci-Fi|Sport' 'Adventure|Comedy|Crime|Family|Mystery' 'Drama|Sci-Fi|Thriller' 'Action|Crime|Mystery|Romance|Thriller' 'Action|Adventure|Comedy|Romance' 'Adventure|Animation|Family|Western' 'Comedy|Family|Romance' 'Action|Adventure|Family|Sci-Fi|Thriller' 'Animation|Family|Fantasy|Music' 'Action|Adventure|Family|Fantasy|Thriller' 'Comedy|Fantasy' 'Action|Adventure|Comedy|Fantasy|Thriller' 'Drama|Horror|Mystery|Sci-Fi' 'Action|Sci-Fi|Thriller' 'Drama|History|Thriller' 'Adventure|Animation|Family' 'Drama|Musical|Romance' 'Documentary|Drama' 
'Action|Adventure|Drama|History|Romance' 'Animation|Family' 'Adventure|Animation|Drama|Family|Musical' 'Animation|Comedy|Family|Fantasy|Sci-Fi' 'Adventure|Animation|Drama|Family|Fantasy' 'Sci-Fi|Thriller' 'Animation|Comedy|Family' 'Action|Crime|Fantasy|Thriller' 'Comedy|Drama|Family|Music|Musical|Romance' 'Horror|Mystery|Thriller' 'Action|Adventure|Comedy|Family|Sci-Fi' 'Comedy|Family' 'Biography|Comedy|Drama|History' 'Drama|Music|Musical' 'Crime|Drama|Mystery' 'Comedy|Crime|Music' 'Action|Comedy|Romance|Thriller' 'Animation|Comedy|Family|Fantasy|Mystery' 'Comedy|Crime|Drama|Romance' 'Action|Adventure|Romance|Thriller' 'Drama|History|Romance' 'Action|Drama|Fantasy|Romance' 'Action|Adventure|Animation|Family|Sci-Fi' 'Action|Drama|Sci-Fi' 'Drama|Horror|Sci-Fi|Thriller' 'Animation|Comedy|Fantasy' 'Action|Animation|Comedy|Family' 'Action|Adventure|Comedy|Romance|Thriller' 'Action|Comedy|Sport' 'Biography|Drama|History|War' 'Adventure|Animation|Comedy' 'Action|Drama|Sport' 'Adventure|Drama|Family' 'Drama|Mystery|Romance|Thriller' 'Adventure|Animation|Comedy|Family|Fantasy|Romance' 'Adventure|Drama|War' 'Action|Adventure|Crime|Thriller' 'Adventure|Drama|Fantasy|Mystery|Thriller' 'Fantasy|Mystery|Romance|Sci-Fi|Thriller' 'Drama|Fantasy|Mystery|Thriller' 'Animation|Comedy|Family|Fantasy|Music' 'Drama|Horror|Romance|Thriller' 'Drama|War' 'Drama' 'Action|Drama|Fantasy|Horror|War' 'Adventure|Family|Fantasy|Romance' 'Adventure|Biography|Drama|History|War' 'Action|Adventure|Horror|Sci-Fi' 'Action|Fantasy|Horror' 'Comedy|Drama|Musical|Romance' 'Action|Sci-Fi|Sport' 'Action|Adventure|Animation|Comedy|Crime|Family|Fantasy' 'Adventure|Animation|Family|Fantasy|Musical' 'Action|Crime|Mystery|Sci-Fi|Thriller' 'Action|Comedy|Crime|Drama|Thriller' 'Adventure|Drama|History|Romance' 'Biography|Drama|Thriller' 'Action|Drama|History|Thriller' 'Action|Adventure|Fantasy|War' 'Comedy|Fantasy|Romance' 'Action|Adventure|Comedy|Romance|Thriller|Western' 'Biography|Drama|Sport|War' 
'Comedy|Drama|Family|Musical' 'Action|Adventure|Fantasy|Horror|Sci-Fi|Thriller' 'Drama|Sport' 'Action|Fantasy|Sci-Fi|Thriller' 'Drama|Mystery|Romance' 'Adventure|Biography|Drama|History|Sport|Thriller' 'Crime|Drama|Fantasy' 'Adventure|Biography|Crime|Drama|Western' 'Action|War' 'Comedy|Romance|Sport' 'Crime|Drama|Mystery|Thriller|Western' 'Comedy|Sport' 'Comedy|Drama|Family' 'Crime|Drama|Fantasy|Mystery' 'Adventure|Animation|Biography|Drama|Family|Fantasy|Musical' 'Drama|Romance|Western' 'Documentary|Music' 'Drama|Thriller' 'Animation|Family|Fantasy' 'Action|Fantasy|Horror|Sci-Fi' 'Biography|Comedy|Drama' 'Action|Horror|Sci-Fi' 'Adventure|Comedy' 'Biography|Drama|History|Sport' 'Comedy|Crime|Romance|Thriller' 'Comedy|Crime|Romance' 'Horror|Mystery|Sci-Fi|Thriller' 'Biography|Drama|Music' 'Drama|Fantasy|Sport' 'Adventure|Comedy|Drama|Music' 'Action|Fantasy|Horror|Sci-Fi|Thriller' 'Adventure|Animation|Comedy|Drama|Family|Fantasy|Romance' 'Horror|Sci-Fi|Thriller' 'Drama|Fantasy|Mystery|Romance|Thriller' 'Action|Adventure|Drama|History|Romance|War' 'Drama|Fantasy|Mystery|Romance' 'Fantasy|Horror|Mystery|Romance' 'Adventure|Comedy|Family|Romance|Sci-Fi' 'Drama|Horror|Thriller' 'Action|Comedy|Mystery|Romance' 'Action|Adventure|Comedy|Romance|Sci-Fi' 'Action|Biography|Drama|History|Thriller|War' 'Adventure|Comedy|Family|Fantasy|Horror' 'Comedy|Family|Romance|Sci-Fi' 'Action|Adventure|Thriller|War' 'Comedy|Drama|Romance|Sport' 'Comedy|Western' 'Action|Comedy|Crime|Drama' 'Drama|Music|Romance|War' 'Action|Comedy|Drama|Family|Thriller' 'Action|Crime' 'Adventure|Animation|Drama|Family|History|Musical|Romance' 'Action|Adventure|Drama|Romance|Sci-Fi' 'Action|Adventure|Comedy|Family|Romance' 'Action|Adventure|Comedy|Western' 'Biography|Drama|History|Musical' 'Adventure|Drama|Horror|Thriller' 'Action|Drama|Sport|Thriller' 'Drama|Musical|Romance|Thriller' 'Comedy|Drama|Family|Fantasy' 'Adventure|Comedy|Crime|Family|Musical' 'Drama|Music|Musical|Romance' 'Drama|Mystery|Romance|War' 
'Crime|Drama|Romance' 'Crime|Horror|Mystery|Thriller' 'Adventure|Animation|Drama|Family|Fantasy|Musical|Mystery|Romance' 'Action|Horror|Thriller' 'Drama|History|Horror' 'Drama|Romance|Sport' 'Comedy|Family|Musical|Romance' 'Romance|Sci-Fi|Thriller' 'Biography|Comedy|Drama|Romance' 'Mystery|Sci-Fi|Thriller' 'Drama|Fantasy|Horror' 'Adventure|Comedy|Drama|Fantasy|Musical' 'Horror|Mystery' 'Action|Adventure|Family|Fantasy|Sci-Fi|Thriller' 'Adventure|Comedy|Family|Fantasy|Romance|Sport' 'Adventure|Horror|Mystery' 'Crime|Drama|Romance|Thriller' 'Comedy|Crime|Drama|Thriller' 'Drama|Fantasy' 'Adventure|Comedy|Drama' 'Action|Biography|Drama|History|War' 'Adventure|Comedy|Fantasy' 'Adventure|Comedy|Crime|Drama|Family' 'Action|Biography|Crime|Drama|Thriller' 'Comedy|Sci-Fi' 'Drama|Romance|Sci-Fi' 'Action|Adventure|Comedy|Crime|Music|Mystery' 'Comedy|Drama|Music' 'Action|Crime|Drama|Sci-Fi|Thriller' 'Horror|Thriller' 'Action|Adventure|Comedy|Drama|War' 'Drama|Mystery|Sci-Fi' 'Crime|Drama|Music' 'Adventure|Crime|Drama|Western' 'Comedy|Drama|Thriller' 'Drama|Romance|War' 'Action|Comedy|Crime|Music|Romance|Thriller' 'Crime|Romance|Thriller' 'Action|Adventure|Drama|Sci-Fi|Thriller' 'Action|Drama|Fantasy|Thriller|Western' 'Action|Drama|Mystery|Thriller|War' 'Biography|Crime|Drama|Thriller' 'Action|Comedy|Crime|Romance' 'Action|Adventure|Family|Fantasy|Sci-Fi' 'Adventure|Comedy|Family|Musical' 'Action|Horror' 'Action|Adventure|Horror|Thriller' 'Comedy|Drama|Music|Romance' 'Action|Crime|Drama|Romance|Thriller' 'Comedy|Family|Romance|Sport' 'Drama|Family|Fantasy' 'Drama|Fantasy|Musical|Romance' 'Adventure|Comedy|Family|Fantasy|Sci-Fi' 'Comedy|Musical' 'Biography|Drama|History' 'Action|Crime|Drama|Thriller|War' 'Comedy|Crime|Thriller' 'Drama|Fantasy|Horror|Mystery' 'Action|Animation|Comedy|Family|Fantasy' 'Biography|Drama|History|Thriller' 'Action|Adventure|Crime|Drama|Mystery|Thriller' 'Animation|Family|Fantasy|Musical' 'Adventure|Drama|Western' 'Biography|Drama|History|Romance' 
'Drama|Horror|Mystery|Thriller' 'Action|Fantasy|Western' 'Comedy|War' 'Drama|Music' 'Action|Drama|Family|Sport' 'Action|Biography|Drama|Thriller|War' 'Comedy|Drama|Sport' 'Adventure|Comedy|Sci-Fi|Western' 'Fantasy|Horror|Romance' 'Biography|Drama|Romance' 'Action|Adventure|Drama|Romance|War' 'Adventure|Comedy|Crime|Romance' 'Comedy|Drama|Family|Fantasy|Romance' 'Horror' 'Comedy|Music' 'Action|Adventure|Drama|Romance|Thriller' 'Biography|Drama|Music|Musical' 'Drama|History' 'Comedy|Music|Romance' 'Action|Adventure|Crime|Fantasy|Mystery|Thriller' 'Adventure|Drama|Mystery' 'Biography|Crime|Drama|Music' 'Crime|Drama|Horror|Thriller' 'Adventure|Animation|Comedy|Drama|Family|Fantasy|Musical' 'Action|Adventure|Comedy|Music|Thriller' 'Adventure|Animation|Comedy|Crime|Family' 'Comedy|Romance|Sci-Fi|Thriller' 'Comedy|Crime|Family|Romance' 'Crime|Horror|Thriller' 'Action|Horror|Mystery|Sci-Fi|Thriller' 'Comedy|Fantasy|Sci-Fi' 'Adventure|Animation|Comedy|Fantasy|Romance' 'Action|Adventure|Family|Thriller' 'Adventure|Comedy|Drama|Romance|Thriller|War' 'Adventure|Animation|Comedy|Fantasy|Music|Romance' 'Action|Drama|Fantasy' 'Action|Adventure|Drama|Fantasy|War' 'Drama|Fantasy|Romance|Sci-Fi' 'Animation|Comedy|Family|Horror|Sci-Fi' 'Biography|Drama|Romance|Sport' 'Action|Biography|Drama' 'Adventure|Drama' 'Horror|Mystery|Sci-Fi' 'Action|Adventure|Drama|Thriller|Western' 'Adventure|Family|Fantasy|Sci-Fi' 'Adventure|Comedy|History|Romance' 'Action|Biography|Drama|Sport' 'Drama|Family' 'Action|Adventure|Crime|Drama|Family|Fantasy|Romance|Thriller' 'Biography|Comedy|Romance' 'Action|Biography|Drama|History' 'Biography|Drama|War' 'Adventure|Comedy|Family|Sci-Fi' 'Biography|Drama|Family|History|Sport' 'Biography|Comedy|Drama|History|Music' 'Fantasy|Horror' 'Comedy|Drama|Family|Sport' 'Comedy|Drama|Romance|Sci-Fi' 'Adventure|Animation|Comedy|Family|War' 'Action|Comedy|Sci-Fi|Thriller' 'Comedy|Horror' 'Drama|Thriller|War' 'Action|Western' 'Action|Adventure|Family|Sci-Fi' 
'Adventure|Biography|Drama|Thriller' 'Drama|Romance|War|Western' 'Action|Comedy|Crime|Western' 'Action|Adventure|Comedy|Drama|Thriller' 'Drama|Music|Romance' 'Action|Adventure|Crime|Drama|Thriller' 'Adventure|Comedy|Family|Sport' 'Comedy|Drama|Fantasy' 'Comedy|Family|Sport' 'Action|Adventure|Drama|Family' 'Action|Comedy|War' 'Drama|Family|Sport' 'Action|Thriller|Western' 'Action|Drama|Fantasy|Horror|Thriller' 'Animation|Comedy|Family|Fantasy|Musical' 'Action|Adventure|Comedy|Fantasy|Romance' 'Action|Crime|Drama|Mystery|Sci-Fi|Thriller' 'Adventure|Comedy|Crime|Drama' 'Drama|Mystery' 'Comedy|Fantasy|Horror|Thriller' 'Crime|Drama|Mystery|Sci-Fi|Thriller' 'Comedy|Crime|Musical' 'Comedy|Drama|Family|Music|Romance' 'Comedy|Horror|Romance' 'Comedy|Family|Fantasy|Sport' 'Animation|Comedy|Family|Mystery|Sci-Fi' 'Adventure|Comedy|Drama|Family|Sport' 'Animation|Drama|Family|Fantasy|Musical|Romance' 'Comedy|Horror|Musical|Sci-Fi' 'Crime|Drama|Sport' 'Action|Adventure|Animation|Drama|Mystery|Sci-Fi|Thriller' 'Action|Adventure|Crime|Drama|Romance' 'Action|Comedy|Horror' 'Adventure|Horror|Thriller' 'Adventure|Fantasy|Mystery' 'Action|Drama|Romance|Sport' 'Biography|Crime|Drama|History|Western' 'Action|Biography|Crime|Drama' 'Adventure|Animation|Fantasy' 'Adventure|Animation|Comedy|Fantasy' 'Biography|Drama|Music|Romance' 'Adventure|Drama|Mystery|Sci-Fi|Thriller' 'Biography|Comedy|Crime|Drama|Romance|Thriller' 'Biography|Crime|Drama|History|Music' 'Adventure|Animation|Comedy|Drama|Family|Musical' 'Biography|Comedy|Drama|Music|Romance' 'Adventure|Animation|Sci-Fi' 'Drama|Romance|Thriller' 'Action|Fantasy|Horror|Thriller' 'Adventure|Biography' 'Action|Comedy|Family' 'Action|Horror|Romance' 'Adventure|Drama|History|Romance|Thriller|War' 'Crime|Drama|Sci-Fi|Thriller' 'Action|Comedy|Crime|Music' 'Comedy|Drama|Family|Romance' 'Action|Drama|Fantasy|Mystery|Sci-Fi|Thriller' 'Adventure|Family|Fantasy|Horror|Mystery' 'Action|Crime|Drama|History|Western' 'Comedy|Crime|Drama' 
'Comedy|Family|Fantasy|Music|Romance' 'Adventure|Comedy|Crime|Music' 'Action|Adventure|Comedy|Sci-Fi|Thriller' 'Action|Crime|Drama|Western' 'Action|Adventure|Comedy|Family|Romance|Sci-Fi' 'Action|Fantasy|Romance|Sci-Fi' 'Comedy|Crime|Mystery|Romance' 'Adventure|Family' 'Action|Drama|Music|Romance' 'Adventure|Comedy|Family|Fantasy|Horror|Mystery' 'Adventure|Fantasy|Mystery|Thriller' 'Action|Biography|Drama|History|Romance|Western' 'Fantasy|Horror|Mystery' 'Biography|Drama|Family' 'Action|Adventure|Comedy|Crime|Family|Romance|Thriller' 'Comedy|Fantasy|Horror|Romance' 'Comedy|Family|Music' 'Action|Comedy|Music' 'Adventure|Comedy|Crime' 'Biography|Comedy|Drama|Sport' 'Fantasy|Horror|Thriller' 'Comedy|Drama|Romance|Thriller' 'Adventure|Comedy|Family|Romance' 'Adventure|Family|Fantasy|Musical' 'Biography|Crime|Drama|History|Thriller' 'Action|Animation|Comedy|Family|Fantasy|Sci-Fi' 'Crime|Drama|History' 'Biography|Drama|Thriller|War' 'Drama|Music|Mystery|Romance|Thriller' 'Action|Adventure|Fantasy|Horror' 'Crime|Drama|Mystery|Romance' 'Action|Adventure|History|Romance' 'Action|Drama|Western' 'Adventure|Comedy|Family|Fantasy|Music|Sci-Fi' 'Adventure|Family|Fantasy|Music|Musical' 'Action|Adventure|Animation|Comedy|Fantasy' 'Adventure|Comedy|Horror|Sci-Fi' 'Horror|Sci-Fi' 'Biography|Comedy|Drama|Family|Sport' 'Action|Crime|Drama|Thriller|Western' 'Action|Drama|History' 'Drama|Fantasy|Romance|Thriller' 'Thriller' 'Comedy|Mystery' 'Comedy|Drama|Musical|Romance|War' 'Drama|History|Music|Romance|War' 'Comedy|History' 'Adventure|Animation|Family|Sport' 'Animation|Comedy|Fantasy|Musical' 'Game-Show|Reality-TV|Romance' 'Action|Comedy|Documentary' 'Adventure|Comedy|Drama|Family|Romance' 'Adventure|Comedy|Drama|Family|Mystery' 'Drama|Family|Music|Romance' 'Fantasy|Romance' 'Adventure|Animation|Family|Musical' 'Animation|Comedy|Drama|Family|Musical' 'Biography|Crime|Drama|History' 'Adventure|Comedy|Fantasy|Music|Sci-Fi' 'Comedy|Drama|Musical|Romance|Western' 
'Action|Adventure|Drama|Mystery' 'Comedy|Crime|Family|Mystery|Romance|Thriller' 'Action|Adventure|Drama|Romance|Western' 'Adventure|Crime|Mystery|Sci-Fi|Thriller' 'Crime|Drama|Western' 'Adventure|Comedy|Drama|Fantasy' 'Adventure|Biography|Drama' 'Adventure|Drama|Horror|Mystery|Thriller' 'Crime|Fantasy|Horror' 'Animation|Family|Fantasy|Mystery' 'Action|Comedy|Crime|Fantasy' 'Comedy|Family|Music|Musical' 'Crime|Documentary|News' 'Drama|Mystery|Romance|Thriller|War' 'Action|Crime|Drama|Sport' 'Comedy|Drama|Music|War' 'Comedy|Musical|Romance' 'Comedy|Drama|Music|Musical|Romance' 'Comedy|Crime|Drama|Mystery|Romance' 'Biography|Comedy|Drama|History|Music|Musical' 'Animation|Drama|Mystery|Sci-Fi|Thriller' 'Adventure|Comedy|Drama|Romance' 'Comedy|Drama|Mystery|Romance|Thriller|War' 'Biography|Comedy|Musical' 'Action|Adventure|Animation|Family|Sci-Fi|Thriller' 'Crime|Drama|Mystery|Romance|Thriller' 'Comedy|Family|Fantasy|Sci-Fi' 'Action|Comedy|Crime|Fantasy|Horror|Mystery|Sci-Fi|Thriller' 'Romance|Short' 'Animation' 'Drama|Horror' 'Comedy|Drama|Reality-TV|Romance' 'Adventure|Comedy|Romance' 'Family|Fantasy|Music' 'Crime|Drama|Music|Thriller' 'Action|Drama|Fantasy|Mystery|Thriller' 'Biography|Drama|History|Music' 'Biography|Drama|Family|Sport' 'Comedy|Drama|War' 'Biography|Drama|Romance|War' 'Action|Horror|Romance|Sci-Fi|Thriller' 'Music' 'Action|Drama|History|Romance|War|Western' 'Action|Animation|Sci-Fi|Thriller' 'Action|Animation|Comedy|Crime|Family' 'Drama|Family|Music|Musical' 'Drama|Family|Musical|Romance' 'Comedy|Drama|Family|Fantasy|Sci-Fi' 'Comedy|Crime|Drama|Music|Romance' 'Adventure|Comedy|Family|Fantasy|Musical' 'Adventure|Crime|Drama|Romance' 'Comedy|Mystery|Sci-Fi|Thriller' 'Sci-Fi' 'Drama|Fantasy|War' 'Action|Comedy|Crime|Family' 'Action|Comedy|Mystery' 'Comedy|Crime|Mystery' 'Action|Crime|Sci-Fi' 'Comedy|Horror|Sci-Fi' 'Action|Comedy|Drama|Thriller' 'Drama|Family|Romance' 'Adventure|Comedy|Family|Music|Romance' 'Comedy|Horror|Thriller' 
'Comedy|Family|Music|Romance' 'Adventure|Fantasy|Horror|Mystery|Thriller' 'Crime|Drama|Musical|Romance' 'Family|Music|Romance' 'Drama|Fantasy|Mystery|Sci-Fi' 'Biography|Drama|History|Thriller|War' 'Adventure|Crime|Drama|Mystery|Western' 'Drama|Fantasy|Horror|Romance' 'Comedy|Crime|Drama|Thriller|War' 'Action|Adventure|Drama|History|Thriller|War' 'Action|Comedy|Drama|War' 'Comedy|Drama|Fantasy|Music|Romance' 'Biography|Drama|Fantasy|History' 'Biography' 'Drama|Family|Music' 'Adventure|Mystery|Thriller' 'Comedy|Mystery|Romance' 'Biography|Crime|Drama|War' 'Crime|Drama|Music|Mystery|Thriller' 'Biography|Comedy|Drama|War' 'Comedy|Crime|Family|Sci-Fi' 'Adventure|Family|Sci-Fi' 'Adventure|Comedy|Romance|Sci-Fi' 'Action|Adventure|Comedy|Family' 'Biography|Comedy|Crime|Drama|Romance' 'Crime|Drama|Musical' 'Animation|Comedy|Crime|Drama|Family' 'Action|Adventure|Comedy|Fantasy|Mystery' 'Action|Adventure|Drama|Thriller|War' 'Crime|Drama|Music|Romance' 'Adventure|Animation|Comedy|Crime' 'Adventure|Comedy|Fantasy|Sci-Fi' 'Comedy|Drama|Family|Fantasy|Musical' 'Action|Adventure|Biography|Drama|History' 'Comedy|Crime|Family' 'Adventure|Drama|Thriller|War' 'Comedy|Drama|Horror|Sci-Fi' 'Adventure|Crime|Thriller' 'Mystery|Romance|Sci-Fi|Thriller' 'Fantasy|Mystery|Thriller' 'Family|Musical' 'Adventure|Crime|Drama|Mystery|Thriller' 'Drama|Fantasy|Music|Romance' 'Adventure|Drama|History|War' 'Family|Sci-Fi' 'Drama|History|Romance|Western' 'Adventure|Comedy|Music|Sci-Fi' 'Drama|Family|Musical' 'Action|Comedy|Drama|Music' 'Fantasy|Horror|Sci-Fi' 'Western' 'Comedy|Romance|Thriller' 'Biography|Crime|Drama|Romance' 'Adventure|Comedy|Drama|Romance|Sci-Fi' 'Drama|Music|Mystery|Romance' 'Action|Crime|Drama' 'Adventure|Biography|Drama|War' 'Action|Comedy|Drama' 'Adventure|Animation' 'Comedy|Drama|Horror|Romance' 'Action|Comedy|Drama|Western' 'Comedy|Crime|Drama|Mystery' 'Adventure|Animation|Fantasy|Horror|Sci-Fi' 'Action|Drama|Romance|Thriller' 'Biography|Comedy|Drama|Family|Romance' 
'Action|Biography|Drama|History|Romance|War' 'Action|Animation|Fantasy|Horror|Mystery|Sci-Fi|Thriller' 'Action|Adventure|Animation|Drama|Fantasy|Sci-Fi' 'Horror|Musical|Sci-Fi' 'Biography|Drama|Family|Musical|Romance' 'Comedy|Crime|Drama|Romance|Thriller' 'Adventure|Drama|Fantasy|Mystery' 'Animation|Comedy|Drama|Romance' 'Comedy|Crime|Musical|Romance' 'Comedy|Crime|Musical|Mystery' 'Action|Animation|Sci-Fi' 'Drama|War|Western' 'Drama|Romance|Sci-Fi|Thriller' 'Animation|Biography|Drama|War' 'Adventure|Fantasy|Thriller' 'Documentary|Sport' 'Crime|Horror' 'Adventure|Biography|Drama|History' 'Action|Crime|Horror|Sci-Fi|Thriller' 'Comedy|Fantasy|Horror|Mystery' 'Action|Adventure|Animation|Comedy|Drama|Family|Fantasy|Thriller' 'Action|Adventure|Drama|Fantasy|Sci-Fi' 'Drama|Mystery|War' 'Action|Comedy|Crime|Drama|Romance|Thriller' 'Comedy|Drama|Musical' 'Mystery|Romance|Thriller' 'Adventure|Comedy|Drama|Family' 'Action|Adventure|Drama|Western' 'Musical|Romance' 'Documentary|Drama|War' 'Biography|Crime|Drama|Western' 'Comedy|Family|Fantasy|Musical' 'Crime|Drama|Musical|Romance|Thriller' 'Fantasy|Horror|Romance|Thriller' 'Adventure|Documentary|Short' 'Adventure|Crime|Drama|Thriller' 'Thriller|War' 'Action|Sport' 'Musical' 'Mystery|Western' 'Comedy|Drama|History|Romance' 'Comedy|Horror|Sci-Fi|Thriller' 'Drama|Horror|Mystery|Sci-Fi|Thriller' 'Comedy|Documentary' 'Adventure|Drama|Family|Fantasy|Sci-Fi' 'Adventure|Drama|Family|Romance|Western' 'Adventure|Horror' 'Comedy|Music|Sci-Fi' 'Biography|Crime|Drama|Romance|Thriller' 'Comedy|Crime|Drama|Mystery|Thriller' 'Biography|Crime|Drama|Mystery|Thriller' 'Crime|Horror|Music|Thriller' 'Crime|Documentary|War' 'Crime|Thriller|War' 'Comedy|Crime|Horror|Thriller' 'Animation|Comedy' 'Family' 'Comedy|Drama|Romance|War' 'Biography|Drama|Romance|Western' 'Drama|Musical' 'Adventure|Comedy|Western' 'Action|Drama|History|Thriller|War' 'Fantasy|Thriller' 'Drama|Horror|Mystery' 'Adventure|Drama|History|Thriller|War' 
'Comedy|Documentary|Drama|Fantasy|Mystery|Sci-Fi' 'Crime|Drama|Fantasy|Romance' 'Action|Crime|Horror|Thriller' 'Comedy|Horror|Mystery' 'Drama|Family|History|Musical' 'Adventure|Biography|Drama|Romance' 'Adventure|War|Western' 'Biography|Comedy|Musical|Romance|Western' 'Adventure|Comedy|Musical|Romance' 'Comedy|Drama|Romance|Western' 'Action|Adventure|Comedy|Musical' 'Comedy|Drama|Fantasy|Horror' 'Action|Biography|Crime|Drama|Family|Fantasy' 'Action|Animation|Crime|Sci-Fi|Thriller' 'Action|Comedy|Horror|Thriller' 'Crime|Documentary|Drama' 'Biography|Comedy|Documentary' 'Comedy|Thriller' 'Comedy|Documentary|Music' 'Action|Adventure|Romance|Western' 'Crime|Drama|History|Romance' 'Family|Fantasy|Musical' 'Comedy|Drama|Horror' 'Drama|Family|Western' 'Comedy|Drama|Horror|Sci-Fi|Thriller' 'Drama|Horror|Romance' 'Adventure|Crime|Drama' 'Action|Adventure|Crime|Drama' 'Adventure|Family|Sport' 'Romance' 'Action|Adventure|Animation|Comedy|Sci-Fi' 'Drama|Fantasy|Romance|War' 'Documentary|History|Sport' 'Action|Drama|Horror|Thriller' 'Comedy|Crime|Drama|Sci-Fi' 'Comedy|Family|Musical|Romance|Short' 'Comedy|Documentary|War' 'Comedy|Drama|Mystery|Romance|Thriller' 'Action|Comedy|Horror|Sci-Fi' 'Adventure|Drama|Romance|Western' 'Animation|Comedy|Drama' 'Adventure|Documentary|Drama|Sport' 'Crime|Documentary' 'Animation|Biography|Documentary|Drama|History|War' 'Documentary|War' 'Documentary|History' 'Biography|Documentary|History' 'Action|Adventure|Comedy|Drama|Music|Sci-Fi' 'Biography|Comedy|Drama|Music' 'Animation|Comedy|Family|Romance' 'Horror|Romance|Sci-Fi' 'Action|Comedy|Fantasy|Horror' 'Crime|Drama|Film-Noir|Mystery|Thriller' 'Comedy|Fantasy|Musical|Sci-Fi' 'Action|Adventure|History|Western' 'Documentary|Drama|History|News' 'Biography|Crime|Documentary|History|Thriller' 'Crime|Drama|Film-Noir' 'Film-Noir|Mystery|Romance|Thriller' 'Comedy|Crime|Sci-Fi|Thriller' 'Adventure|Comedy|Horror' 'Action|Crime|Drama|Mystery' 'Horror|Romance|Thriller' 'Drama|Film-Noir|Mystery|Thriller' 
'Drama|Film-Noir' 'Crime|Film-Noir|Thriller' 'Action|Adventure|Romance|War' 'Action|Horror|Mystery|Thriller' 'Adventure|Comedy|Sport' 'Comedy|Horror|Musical' 'Adventure|Comedy|History' 'Action|Drama|Romance|War' 'Biography|Documentary|Music' 'Comedy|Fantasy|Mystery' 'Biography|Crime|Documentary|History' 'Adventure|Biography|Documentary|Drama' 'Action|Adventure|Comedy|Fantasy|Sci-Fi' 'Drama|Musical|Sci-Fi' 'Documentary|News' 'Comedy|Fantasy|Thriller' 'Animation|Drama|Family' 'Drama|Fantasy|Sci-Fi' 'Action|Comedy|Drama|Sci-Fi' 'Action|Adventure|Drama|War' 'Horror|Sci-Fi|Short|Thriller' 'Action|Adventure|Animation|Comedy|Fantasy|Sci-Fi' 'Thriller|Western' 'Documentary|Drama|Sport' 'Documentary|History|Music' 'Biography|Documentary|Drama' 'Adventure|Family|Romance' 'Adventure|Biography|Drama|Horror|Thriller' 'Documentary|Family|Music' 'Biography|Documentary|Sport' 'History' 'Action|Romance|Sport' 'Horror|Musical' 'Comedy|Mystery|Thriller' 'Action|Biography|Documentary|Sport' 'Comedy|Fantasy|Horror|Musical' 'Drama|Fantasy|Sci-Fi|Thriller' 'Biography|Documentary' 'Animation|Drama' 'Action|Fantasy|Horror|Mystery|Thriller' 'Action|Comedy|Sci-Fi|Sport' 'Comedy|Crime|Drama|Horror|Mystery|Thriller' 'Action|Adventure|Mystery|Romance|Thriller' 'Animation|Comedy|Drama|Fantasy|Sci-Fi' 'Action|Drama|Fantasy|Sci-Fi' 'Comedy|Short' 'Adventure|Drama|Fantasy|Thriller|Western' 'Adventure|Horror|Sci-Fi' 'Comedy|Drama|History|Musical|Romance' 'Comedy|Horror|Mystery|Thriller' 'Drama|Music|Mystery|Romance|Sci-Fi' 'Adventure|Documentary' 'Documentary|Family' 'Comedy|Crime|Drama|Horror|Thriller' 'Comedy|Documentary|Drama' 'Crime|Drama|Horror' 'Comedy|Crime|Horror']
actor_1_name:
- Total unique values: 2098
- Values: ['CCH Pounder' 'Johnny Depp' 'Christoph Waltz' ... 'Natalie Zea' 'Eva Boehnke' 'John August']
movie_title:
- Total unique values: 4917
- Values: ['Avatar\xa0' "Pirates of the Caribbean: At World's End\xa0" 'Spectre\xa0' ...
'A Plague So Pleasant\xa0' 'Shanghai Calling\xa0' 'My Date with Drew\xa0']
num_voted_users:
- Total unique values: 4826
- Values: [886204 471220 275868 ... 73839 1255 4285]
cast_total_facebook_likes:
- Total unique values: 3978
- Values: [ 4834 48350 11700 ... 93 690 2386]
actor_3_name:
- Total unique values: 3522
- Values: ['Wes Studi' 'Jack Davenport' 'Stephanie Sigman' ... 'David Chandler' 'Eliza Coupe' 'Jon Gunn']
facenumber_in_poster:
- Total unique values: 20
- Values: [ 0. 1. 4. 3. 2. 6. 7. 5. 8. nan 10. 15. 9. 11. 12. 31. 14. 19. 13. 43.]
plot_keywords:
- Total unique values: 4761
- Values: ['avatar|future|marine|native|paraplegic' 'goddess|marriage ceremony|marriage proposal|pirate|singapore' 'bomb|espionage|sequel|spy|terrorist' ... 'fraud|postal worker|prison|theft|trial' 'cult|fbi|hideout|prison escape|serial killer' 'actress name in title|crush|date|four word title|video camera']
movie_imdb_link:
- Total unique values: 4919
- Values: ['http://www.imdb.com/title/tt0499549/?ref_=fn_tt_tt_1' 'http://www.imdb.com/title/tt0449088/?ref_=fn_tt_tt_1' 'http://www.imdb.com/title/tt2379713/?ref_=fn_tt_tt_1' ...
'http://www.imdb.com/title/tt2107644/?ref_=fn_tt_tt_1' 'http://www.imdb.com/title/tt2070597/?ref_=fn_tt_tt_1' 'http://www.imdb.com/title/tt0378407/?ref_=fn_tt_tt_1']
num_user_for_reviews:
- Total unique values: 955
- Values: [3.054e+03 1.238e+03 9.940e+02 2.701e+03 nan 7.380e+02 1.902e+03 3.870e+02 1.117e+03 9.730e+02 3.018e+03 2.367e+03 1.243e+03 1.832e+03 7.110e+02 2.536e+03 4.380e+02 1.722e+03 4.840e+02 3.410e+02 8.020e+02 1.225e+03 5.460e+02 9.510e+02 6.660e+02 2.618e+03 2.528e+03 1.022e+03 7.510e+02 1.290e+03 1.498e+03 1.303e+03 1.187e+03 7.360e+02 1.912e+03 2.650e+02 1.439e+03 9.180e+02 5.110e+02 1.067e+03 6.650e+02 2.830e+02 5.500e+02 7.330e+02 9.740e+02 6.570e+02 9.950e+02 7.520e+02 1.171e+03 2.050e+02 7.530e+02 4.530e+02 1.106e+03 8.990e+02 2.054e+03 3.450e+02 4.280e+02 4.320e+02 1.043e+03 2.210e+02 1.055e+03 2.490e+02 7.200e+02 2.390e+02 1.463e+03 6.220e+02 4.667e+03 7.040e+02 1.870e+02 6.780e+02 6.480e+02 5.010e+02 9.710e+02 2.570e+02 7.410e+02 3.090e+02 5.340e+02 7.730e+02 3.980e+02 7.230e+02 7.100e+02 6.340e+02 6.200e+02 1.500e+01 3.240e+02 7.420e+02 1.730e+02 4.970e+02 4.330e+02 4.440e+02 5.200e+02 4.920e+02 1.676e+03 1.097e+03 2.725e+03 2.803e+03 1.300e+01 1.367e+03 9.880e+02 8.220e+02 6.980e+02 3.830e+02 2.380e+02 6.290e+02 1.310e+02 3.260e+02 7.810e+02 8.670e+02 2.270e+02 1.999e+03 1.782e+03 1.390e+03 1.108e+03 1.896e+03 5.900e+02 1.413e+03 1.361e+03 6.260e+02 2.685e+03 1.190e+02 2.090e+02 6.410e+02 2.121e+03 9.040e+02 2.789e+03 5.320e+02 1.588e+03 4.350e+02 1.780e+02 9.000e+01 2.530e+02 4.790e+02 4.400e+02 2.060e+02 1.382e+03 8.710e+02 4.340e+02 1.120e+02 1.220e+02 1.860e+02 1.300e+02 1.694e+03 1.540e+02 1.185e+03 1.211e+03 6.060e+02 5.050e+02 1.450e+02 5.120e+02 1.740e+02 2.580e+02 9.280e+02 1.559e+03 2.012e+03 3.430e+02 2.730e+02 3.880e+02 1.229e+03 2.870e+02 1.445e+03 2.880e+02 4.630e+02 7.880e+02 6.790e+02 6.830e+02 6.840e+02 3.290e+02 7.900e+01 6.430e+02 7.400e+01 1.060e+02 1.188e+03 3.370e+02 1.180e+02 8.200e+02 3.600e+02 5.490e+02 
7.060e+02 2.140e+02 2.741e+03 1.370e+02 5.140e+02 1.240e+03 4.470e+02 1.504e+03 4.500e+02 7.440e+02 2.410e+02 2.000e+00 1.260e+02 1.571e+03 2.100e+02 2.113e+03 5.910e+02 1.966e+03 9.900e+01 3.660e+02 4.120e+02 6.370e+02 3.910e+02 5.040e+02 1.018e+03 4.820e+02 1.159e+03 1.426e+03 7.790e+02 4.360e+02 7.550e+02 6.810e+02 2.970e+02 5.540e+02 2.326e+03 6.900e+01 8.140e+02 6.300e+02 4.140e+02 1.960e+02 3.480e+02 8.920e+02 3.286e+03 3.516e+03 5.930e+02 5.330e+02 3.597e+03 1.950e+02 3.600e+01 4.540e+02 1.340e+02 4.910e+02 1.885e+03 2.770e+02 1.150e+02 6.950e+02 4.990e+02 3.280e+02 1.144e+03 6.270e+02 7.980e+02 7.990e+02 1.210e+02 4.430e+02 9.700e+01 5.230e+02 1.530e+02 8.800e+01 1.440e+02 4.260e+02 5.900e+01 2.480e+02 7.820e+02 5.060e+03 1.910e+02 3.860e+02 4.560e+02 7.890e+02 9.420e+02 1.790e+02 1.023e+03 8.900e+01 4.980e+02 2.368e+03 1.331e+03 8.580e+02 2.301e+03 1.368e+03 9.830e+02 5.850e+02 4.580e+02 3.510e+02 2.850e+02 1.520e+02 3.160e+02 1.193e+03 2.300e+02 4.740e+02 6.920e+02 1.690e+03 1.130e+02 3.740e+02 5.240e+02 1.138e+03 5.390e+02 1.049e+03 8.280e+02 1.600e+02 6.600e+02 2.690e+02 2.170e+02 2.240e+02 1.630e+02 1.610e+02 1.640e+02 2.610e+02 1.550e+02 2.630e+02 8.900e+02 1.166e+03 4.070e+02 1.103e+03 1.770e+02 2.930e+02 1.390e+02 3.220e+02 8.660e+02 3.189e+03 2.417e+03 8.240e+02 4.620e+02 1.236e+03 6.460e+02 1.250e+02 5.030e+02 2.670e+02 8.150e+02 1.690e+02 4.190e+02 2.890e+02 5.150e+02 3.940e+02 1.560e+02 1.320e+02 2.220e+02 5.770e+02 6.320e+02 3.460e+02 6.210e+02 1.000e+00 5.300e+01 3.800e+02 3.420e+02 2.153e+03 6.110e+02 6.280e+02 4.110e+02 2.040e+02 1.840e+02 3.610e+02 9.490e+02 3.620e+02 1.810e+02 1.140e+02 2.780e+02 5.290e+02 2.400e+01 4.550e+02 3.950e+02 1.051e+03 5.480e+02 8.060e+02 3.930e+02 8.450e+02 2.540e+02 1.680e+02 3.990e+02 2.700e+02 2.760e+02 7.700e+01 1.940e+02 4.830e+02 4.150e+02 1.000e+02 7.130e+02 6.620e+02 2.550e+02 3.780e+02 7.800e+01 7.030e+02 3.390e+02 1.460e+02 2.080e+02 5.220e+02 4.660e+02 1.710e+02 5.210e+02 5.880e+02 6.850e+02 2.110e+02 
6.300e+01 4.880e+02 5.890e+02 6.040e+02 1.959e+03 4.020e+02 6.100e+02 2.500e+02 3.670e+02 1.020e+02 4.930e+02 5.600e+01 9.640e+02 1.240e+02 5.350e+02 1.160e+02 1.009e+03 2.030e+02 3.180e+02 6.740e+02 5.600e+02 6.400e+01 8.050e+02 1.230e+02 4.240e+02 1.620e+02 7.010e+02 2.160e+02 1.270e+02 3.700e+02 3.630e+02 4.000e+01 7.100e+01 6.150e+02 5.430e+02 2.440e+02 6.190e+02 2.370e+02 4.050e+02 3.000e+00 1.820e+02 2.520e+02 3.150e+02 1.010e+02 6.600e+01 1.308e+03 2.320e+02 4.570e+02 3.760e+02 1.170e+02 4.200e+01 3.010e+02 3.080e+02 6.560e+02 1.330e+02 8.500e+01 7.640e+02 2.640e+02 1.970e+02 7.860e+02 2.840e+02 8.560e+02 5.520e+02 3.850e+02 1.206e+03 1.401e+03 4.010e+02 2.310e+02 3.110e+02 5.960e+02 9.400e+01 5.700e+01 3.790e+02 6.700e+01 3.230e+02 1.200e+02 2.290e+02 1.800e+02 3.770e+02 3.730e+02 1.344e+03 2.750e+02 2.740e+02 1.380e+02 4.370e+02 1.065e+03 7.630e+02 2.003e+03 9.800e+01 2.335e+03 5.840e+02 7.370e+02 1.580e+02 1.527e+03 3.400e+02 1.100e+02 1.248e+03 1.040e+03 1.283e+03 8.000e+02 8.160e+02 4.030e+02 2.450e+02 4.230e+02 2.810e+02 1.410e+02 1.890e+02 2.710e+02 3.560e+02 1.570e+02 5.400e+01 2.070e+02 6.180e+02 3.130e+02 2.790e+02 2.020e+02 5.680e+02 4.450e+02 2.960e+02 6.200e+01 1.720e+02 1.430e+02 8.700e+01 2.277e+03 4.670e+02 3.646e+03 5.560e+02 5.640e+02 3.100e+02 1.500e+03 2.130e+02 2.968e+03 1.750e+02 2.800e+01 2.900e+02 7.950e+02 8.950e+02 8.600e+01 6.020e+02 2.460e+02 8.200e+01 2.073e+03 3.200e+02 1.377e+03 2.510e+02 1.127e+03 8.490e+02 4.160e+02 8.770e+02 3.530e+02 2.350e+02 1.280e+02 3.020e+02 4.520e+02 8.360e+02 4.480e+02 2.230e+02 1.264e+03 6.900e+02 8.300e+01 5.970e+02 1.760e+02 3.470e+02 4.100e+01 2.560e+02 3.380e+02 4.710e+02 5.690e+02 2.590e+02 7.300e+01 7.200e+01 8.420e+02 1.500e+02 1.650e+02 2.330e+02 9.100e+01 8.570e+02 2.430e+02 6.440e+02 2.470e+02 2.820e+02 1.090e+02 1.880e+02 3.680e+02 9.000e+00 1.830e+02 1.058e+03 9.160e+02 4.100e+02 4.310e+02 1.980e+02 1.470e+02 3.310e+02 2.200e+02 2.250e+02 1.398e+03 5.070e+02 3.720e+02 1.030e+02 3.320e+02 
5.270e+02 9.200e+01 2.420e+02 6.000e+02 9.500e+01 2.105e+03 3.580e+02 9.350e+02 1.490e+02 1.080e+02 7.760e+02 5.720e+02 3.210e+02 2.047e+03 8.400e+01 3.900e+01 1.420e+02 5.590e+02 5.990e+02 5.450e+02 4.950e+02 4.180e+02 2.319e+03 1.900e+02 4.290e+02 6.670e+02 6.400e+02 6.120e+02 1.448e+03 4.810e+02 9.190e+02 9.450e+02 6.230e+02 4.700e+02 3.750e+02 3.570e+02 4.060e+02 1.990e+02 7.910e+02 2.920e+02 3.120e+02 3.840e+02 2.042e+03 1.920e+02 3.040e+02 6.250e+02 2.190e+02 3.060e+02 5.060e+02 1.740e+03 1.480e+02 6.720e+02 1.400e+02 2.260e+02 3.300e+01 6.500e+01 3.920e+02 2.860e+02 2.600e+02 2.200e+01 4.700e+01 4.200e+02 3.900e+02 7.000e+01 9.300e+01 2.940e+02 4.770e+02 3.970e+02 8.500e+02 3.200e+01 1.850e+02 5.370e+02 3.070e+02 5.500e+01 2.360e+02 6.510e+02 1.070e+02 2.900e+01 1.600e+01 4.610e+02 2.120e+02 2.280e+02 6.330e+02 6.090e+02 5.820e+02 3.500e+02 1.004e+03 3.170e+02 5.410e+02 6.100e+01 1.053e+03 3.550e+02 3.050e+02 8.000e+01 8.100e+01 5.310e+02 4.090e+02 3.490e+02 1.350e+02 5.020e+02 3.030e+02 7.500e+01 2.680e+02 4.600e+02 3.960e+02 8.410e+02 8.350e+02 1.800e+01 5.000e+00 6.580e+02 3.360e+02 3.270e+02 1.360e+03 1.660e+02 6.960e+02 7.340e+02 6.540e+02 2.000e+02 5.800e+01 3.820e+02 1.732e+03 4.220e+02 2.150e+02 4.420e+02 3.440e+02 5.440e+02 1.100e+03 4.400e+01 3.400e+01 3.710e+02 9.980e+02 2.910e+02 3.340e+02 5.180e+02 5.870e+02 4.300e+01 1.670e+02 7.540e+02 1.050e+02 6.000e+01 1.314e+03 1.110e+02 2.500e+01 8.010e+02 2.600e+01 2.340e+02 1.594e+03 5.000e+01 6.380e+02 9.110e+02 4.500e+01 2.660e+02 4.750e+02 1.437e+03 1.535e+03 2.100e+01 2.180e+02 7.840e+02 8.170e+02 1.360e+02 3.000e+01 7.240e+02 6.310e+02 6.680e+02 6.770e+02 2.620e+02 4.000e+02 1.100e+01 7.610e+02 5.760e+02 9.030e+02 1.510e+02 6.800e+01 3.350e+02 3.250e+02 2.980e+02 9.750e+02 2.800e+02 3.100e+01 1.290e+02 1.590e+02 7.600e+01 9.150e+02 3.190e+02 1.033e+03 6.470e+02 4.410e+02 4.850e+02 6.690e+02 5.710e+02 1.125e+03 3.640e+02 4.900e+01 5.100e+02 1.017e+03 1.080e+03 1.262e+03 1.111e+03 5.090e+02 3.000e+02 
5.810e+02 6.910e+02 9.860e+02 1.040e+02 2.010e+02 8.300e+02 3.540e+02 6.710e+02 3.800e+01 1.900e+01 3.650e+02 2.700e+01 1.700e+01 2.300e+01 7.000e+00 1.400e+01 9.600e+01 5.100e+01 7.220e+02 1.700e+02 1.057e+03 9.620e+02 7.140e+02 1.168e+03 6.240e+02 5.200e+01 4.940e+02 6.730e+02 4.720e+02 8.620e+02 2.814e+03 1.273e+03 8.850e+02 2.192e+03 1.518e+03 4.390e+02 9.890e+02 8.510e+02 1.107e+03 3.500e+01 4.144e+03 1.200e+01 8.880e+02 3.700e+01 5.130e+02 2.000e+01 4.800e+01 8.000e+00 4.040e+02 3.300e+02 5.530e+02 9.000e+02 7.480e+02 5.800e+02 3.810e+02 5.170e+02 7.710e+02 8.070e+02 4.600e+01 6.130e+02 3.890e+02 9.080e+02 7.000e+02 1.514e+03 7.600e+02 5.920e+02 6.000e+00 3.590e+02 2.990e+02 5.700e+02 1.137e+03 8.690e+02 4.780e+02 1.320e+03 8.090e+02 5.360e+02 9.020e+02 4.000e+00 1.198e+03 4.250e+02 1.101e+03 1.930e+02 5.550e+02 1.109e+03 1.076e+03 1.191e+03 8.550e+02 6.450e+02 4.690e+02 5.830e+02 1.083e+03 6.870e+02 3.520e+02 1.641e+03 2.715e+03 6.360e+02 6.080e+02 4.210e+02 4.680e+02 1.026e+03 2.400e+02 6.140e+02 7.180e+02 7.350e+02 8.760e+02 3.690e+02 1.028e+03 1.768e+03 4.130e+02 7.470e+02 2.254e+03 3.140e+02 6.160e+02 1.140e+03 6.500e+02 1.066e+03 8.400e+02 5.780e+02 6.050e+02 1.000e+01 7.260e+02 1.470e+03 7.490e+02 1.736e+03 6.820e+02 9.850e+02 1.015e+03 9.440e+02 5.400e+02 1.420e+03 1.624e+03 5.510e+02 1.110e+03 8.260e+02 2.195e+03 8.890e+02 1.441e+03 1.061e+03 4.300e+02 8.810e+02 2.238e+03 5.260e+02 9.220e+02 4.080e+02 1.416e+03 1.182e+03 5.470e+02 2.720e+02 6.640e+02 8.640e+02 2.067e+03 7.560e+02 5.610e+02 8.590e+02 5.650e+02 1.516e+03 1.916e+03 2.110e+03 1.848e+03 3.330e+02 4.860e+02 4.270e+02 7.310e+02 9.780e+02 8.390e+02 7.090e+02 1.509e+03 9.310e+02 7.800e+02 1.123e+03 5.420e+02 3.400e+03 4.510e+02 1.473e+03 1.189e+03 7.400e+02 5.000e+02 5.860e+02]
language:
- Total unique values: 47
- Values: ['English' nan 'Japanese' 'French' 'Mandarin' 'Aboriginal' 'Spanish' 'Filipino' 'Hindi' 'Russian' 'Maya' 'Kazakh' 'Telugu' 'Cantonese' 'Icelandic' 'German' 'Aramaic' 
'Italian' 'Dutch' 'Dari' 'Hebrew' 'Chinese' 'Mongolian' 'Swedish' 'Korean' 'Thai' 'Polish' 'Bosnian' 'Hungarian' 'Portuguese' 'Danish' 'Arabic' 'Norwegian' 'Czech' 'Kannada' 'Zulu' 'Panjabi' 'Tamil' 'Dzongkha' 'Vietnamese' 'Indonesian' 'Urdu' 'Romanian' 'Persian' 'Slovenian' 'Greek' 'Swahili']
country:
- Total unique values: 66
- Values: ['USA' 'UK' nan 'New Zealand' 'Canada' 'Australia' 'Belgium' 'Japan' 'Germany' 'China' 'France' 'New Line' 'Mexico' 'Spain' 'Hong Kong' 'Czech Republic' 'India' 'Soviet Union' 'South Korea' 'Peru' 'Italy' 'Russia' 'Aruba' 'Denmark' 'Libya' 'Ireland' 'South Africa' 'Iceland' 'Switzerland' 'Romania' 'West Germany' 'Chile' 'Netherlands' 'Hungary' 'Panama' 'Greece' 'Sweden' 'Norway' 'Taiwan' 'Official site' 'Cambodia' 'Thailand' 'Slovakia' 'Bulgaria' 'Iran' 'Poland' 'Georgia' 'Turkey' 'Nigeria' 'Brazil' 'Finland' 'Bahamas' 'Argentina' 'Colombia' 'Israel' 'Egypt' 'Kyrgyzstan' 'Indonesia' 'Pakistan' 'Slovenia' 'Afghanistan' 'Dominican Republic' 'Cameroon' 'United Arab Emirates' 'Kenya' 'Philippines']
content_rating:
- Total unique values: 19
- Values: ['PG-13' nan 'PG' 'G' 'R' 'TV-14' 'TV-PG' 'TV-MA' 'TV-G' 'Not Rated' 'Unrated' 'Approved' 'TV-Y' 'NC-17' 'X' 'TV-Y7' 'GP' 'Passed' 'M']
budget:
- Total unique values: 440
- Values: [2.3700000e+08 3.0000000e+08 2.4500000e+08 2.5000000e+08 nan 2.6370000e+08 2.5800000e+08 2.6000000e+08 2.0900000e+08 2.0000000e+08 2.2500000e+08 2.1500000e+08 2.2000000e+08 2.3000000e+08 1.8000000e+08 2.0700000e+08 1.5000000e+08 2.1000000e+08 1.7000000e+08 1.9000000e+08 1.9500000e+08 1.0500000e+08 1.8500000e+08 1.4000000e+08 1.7600000e+08 1.7800000e+08 1.7500000e+08 1.4500000e+08 1.6500000e+08 1.6000000e+08 3.8000000e+07 1.5500000e+08 1.0000000e+08 1.4900000e+08 1.4200000e+08 1.4400000e+08 1.3900000e+08 1.3500000e+08 1.3000000e+08 1.3700000e+08 1.2000000e+08 1.5000000e+06 1.3200000e+08 1.1000000e+08 1.2500000e+08 1.2750000e+08 1.2700000e+08 1.0300000e+08 6.5000000e+07 8.5000000e+07 1.2300000e+08 
1.1500000e+08 1.1700000e+08 1.1300000e+08 7.8000000e+07 1.1600000e+08 1.1200000e+08 9.3000000e+07 1.0700000e+08 1.0900000e+08 1.3300000e+08 1.0800000e+08 1.2600000e+08 9.0000000e+07 1.0200000e+08 9.2000000e+07 8.3000000e+07 8.0000000e+07 8.4000000e+07 9.9000000e+07 1.0000000e+07 9.8000000e+07 9.4000000e+07 9.5000000e+07 7.5000000e+07 8.8000000e+07 6.8000000e+07 8.6000000e+07 2.0000000e+07 8.7000000e+07 7.0000000e+07 6.0000000e+07 3.5000000e+07 8.0000000e+06 8.2000000e+07 8.1000000e+07 7.9000000e+07 4.4000000e+07 4.0000000e+07 5.2000000e+07 5.8000000e+07 4.5000000e+07 7.6000000e+07 8.1200000e+07 7.3000000e+07 5.0000000e+07 5.3000000e+07 5.5000000e+07 7.4000000e+07 6.9000000e+07 7.2000000e+07 5.9660000e+07 7.1500000e+07 6.6000000e+07 6.9500000e+07 3.6000000e+07 5.9000000e+07 6.3000000e+07 6.2000000e+07 6.1000000e+07 5.0100000e+07 1.6900000e+07 4.3000000e+07 6.4000000e+07 4.2000000e+07 4.8000000e+07 3.0000000e+07 6.8005000e+07 5.8800000e+07 3.0000000e+06 5.7000000e+07 5.6000000e+07 5.4000000e+07 1.4000000e+06 7.1000000e+07 4.7000000e+07 2.0000000e+06 4.6000000e+07 5.2500000e+07 5.1000000e+07 5.0200000e+07 2.5000000e+07 3.9000000e+08 4.9900000e+07 2.2000000e+07 1.8000000e+07 4.9000000e+07 1.4000000e+07 1.0000000e+06 2.5000000e+06 2.6000000e+07 4.4500000e+07 2.6000000e+06 3.1115000e+07 3.2000000e+07 3.1000000e+07 2.7000000e+07 4.1000000e+07 3.4000000e+07 5.0000000e+05 7.7000000e+07 2.4000000e+07 3.3000000e+07 3.9200000e+07 2.3000000e+07 1.8026148e+07 3.9000000e+07 5.5363200e+08 3.8600000e+07 1.5000000e+07 3.7000000e+07 2.9500000e+07 3.5200000e+07 2.9000000e+07 1.8000000e+06 1.0700000e+07 1.9000000e+07 3.2500000e+07 2.8000000e+07 3.1500000e+07 6.5000000e+06 3.0250000e+07 3.4200000e+07 1.7000000e+07 2.7800000e+07 2.1000000e+07 1.2000000e+07 2.7500000e+07 1.6000000e+07 1.3500000e+07 2.5530000e+07 2.5100000e+07 2.8000000e+06 2.5500000e+07 2.1150000e+07 1.3000000e+07 8.2000000e+06 2.3600000e+07 1.2500000e+07 1.9430000e+07 1.1000000e+07 2.2700000e+07 2.2500000e+07 
2.3500000e+07 2.1500000e+07 9.0000000e+06 1.9400870e+07 1.9800000e+07 8.0694700e+05 1.9500000e+07 8.7000000e+06 2.4000000e+09 2.1275199e+09 1.3000000e+04 2.7220000e+07 1.9400000e+07 1.8500000e+07 2.7000000e+06 1.1350000e+07 3.5000000e+06 1.7900000e+07 1.7500000e+07 3.0000000e+05 4.0000000e+06 1.6500000e+07 1.6800000e+07 1.6400000e+07 1.5600000e+07 1.7700000e+07 1.5500000e+07 1.5300000e+07 9.8000000e+06 7.0000000e+06 1.1500000e+07 6.0000000e+06 1.4600000e+07 1.4800000e+07 1.4500000e+07 1.4400000e+07 1.4200000e+07 1.5800000e+07 8.5000000e+06 1.3400000e+07 1.3200000e+07 8.4950000e+06 1.2620000e+07 3.6600000e+06 1.2800000e+07 1.0500000e+07 9.6000000e+06 5.0000000e+06 1.2215500e+10 9.2000000e+06 2.5000000e+09 7.5000000e+06 1.1900000e+07 1.0800000e+07 7.0000000e+08 1.0600000e+07 1.0818775e+07 1.3800000e+07 1.2305523e+07 1.2600000e+07 6.4000000e+06 6.2000000e+06 9.5000000e+06 8.9000000e+06 9.4000000e+06 9.3000000e+06 6.0000000e+08 7.4000000e+06 7.2176000e+06 8.3532000e+04 4.0000000e+08 1.1400000e+07 8.8000000e+06 8.6000000e+06 7.6230000e+06 8.3000000e+06 8.5500000e+06 7.2000000e+06 1.1000000e+09 7.9000000e+06 7.7000000e+06 4.5000000e+06 7.3000000e+06 6.6000000e+06 2.3000000e+06 3.5000000e+05 4.8250000e+06 6.9000000e+06 6.8000000e+06 4.8000000e+06 6.2440870e+06 7.8400000e+06 5.9520000e+06 5.3000000e+06 6.7000000e+06 3.5001590e+06 5.6000000e+06 5.5000000e+06 3.8500000e+06 5.2500000e+06 3.0300000e+07 5.1000000e+06 4.9000000e+06 3.3000000e+06 8.0000000e+05 2.2000000e+06 3.2090000e+06 8.4450000e+07 8.9000000e+05 4.7000000e+06 4.6387830e+06 4.6000000e+06 4.2000000e+09 4.4000000e+06 4.2000000e+06 1.1400000e+05 3.6000000e+06 3.2000000e+06 3.4000000e+06 1.3000000e+06 6.5000000e+05 3.9500000e+06 3.8000000e+06 3.9770000e+06 3.7687850e+06 3.7000000e+06 3.7169460e+06 1.9900000e+07 3.4400000e+06 4.3000000e+06 3.1800000e+06 4.4903750e+06 1.9000000e+06 2.9000000e+06 2.8838480e+06 2.6865850e+06 2.6500000e+06 2.6270000e+06 2.5408000e+06 3.4000000e+04 8.4000000e+06 2.4000000e+06 
2.3610000e+06 2.4500000e+06 2.2954290e+06 2.2800000e+06 2.1600000e+06 1.2000000e+06 2.1000000e+06 1.6140000e+06 1.4000000e+04 1.0000000e+05 1.2500000e+06 1.9500000e+06 1.7500000e+06 1.7000000e+06 1.6447360e+06 1.6500000e+06 1.6000000e+06 1.1000000e+06 1.6963770e+06 1.4550000e+06 3.1500000e+06 1.3778000e+06 9.6000000e+05 1.5920000e+06 1.2880000e+06 4.2700000e+05 1.4200000e+06 6.9539300e+05 9.5000000e+05 1.0000000e+09 9.0000000e+05 9.8900000e+05 9.1300000e+05 9.1000000e+05 9.3000000e+05 5.9000000e+05 8.5000000e+05 8.2500000e+05 9.9000000e+05 6.0000000e+05 7.8000000e+05 7.7700000e+05 7.5000000e+05 7.0000000e+05 4.0000000e+05 6.2500000e+05 6.0900000e+05 6.0000000e+04 5.6000000e+05 5.5000000e+05 4.6000000e+04 1.5000000e+05 4.7500000e+05 4.5000000e+05 4.3900000e+05 2.2500000e+05 1.0661670e+06 1.5000000e+04 2.2957500e+05 2.1800000e+02 3.8590700e+05 3.7500000e+05 3.7900000e+05 1.7502110e+06 3.2500000e+05 3.1200000e+05 2.0000000e+05 1.6000000e+05 2.5000000e+05 2.7000000e+05 2.9000000e+05 3.6500000e+05 2.4500000e+05 2.4000000e+05 2.1000000e+05 1.8000000e+05 1.2000000e+05 1.7500000e+05 1.6800000e+05 1.2500000e+05 1.0300000e+05 2.0000000e+04 4.0000000e+04 7.0000000e+04 7.5000000e+04 6.5000000e+04 6.2000000e+04 2.5000000e+04 5.0000000e+04 4.2000000e+04 4.5000000e+04 3.0000000e+04 2.3000000e+05 2.7000000e+04 2.4000000e+04 2.3000000e+04 2.2000000e+04 1.7350000e+04 1.0000000e+04 4.5000000e+03 7.0000000e+03 3.2500000e+03 9.0000000e+03 1.4000000e+03 1.1000000e+03] title_year: - Total de datos únicos: 92 - Valores: [2009. 2007. 2015. 2012. nan 2010. 2016. 2006. 2008. 2013. 2011. 2014. 2005. 1997. 2004. 1999. 1995. 2003. 2001. 2002. 1998. 2000. 1990. 1991. 1994. 1996. 1982. 1993. 1979. 1992. 1989. 1984. 1988. 1978. 1962. 1980. 1972. 1981. 1968. 1985. 1940. 1963. 1987. 1986. 1973. 1983. 1976. 1977. 1970. 1971. 1969. 1960. 1965. 1964. 1927. 1974. 1937. 1975. 1967. 1951. 1961. 1946. 1953. 1954. 1959. 1932. 1947. 1956. 1945. 1952. 1930. 1966. 1939. 1950. 1948. 1958. 1957. 1943. 1944. 
1938. 1949. 1936. 1941. 1955. 1942. 1929. 1935. 1933. 1916. 1934. 1925. 1920.] actor_2_facebook_likes: - Total de datos únicos: 918 - Valores: [9.36e+02 5.00e+03 3.93e+02 2.30e+04 1.20e+01 6.32e+02 1.10e+04 5.53e+02 2.10e+04 4.00e+03 1.00e+04 4.12e+02 2.00e+03 3.00e+03 2.16e+02 8.16e+02 9.72e+02 8.82e+02 6.00e+03 9.19e+02 1.40e+04 1.90e+04 5.63e+02 2.50e+04 8.08e+02 7.79e+02 5.81e+02 9.56e+02 1.50e+04 3.68e+02 1.00e+03 2.20e+04 9.81e+02 5.57e+02 5.09e+02 5.67e+02 9.68e+02 8.29e+02 1.50e+02 1.19e+02 7.29e+02 2.68e+02 4.68e+02 1.90e+02 1.30e+04 8.48e+02 9.73e+02 1.20e+04 1.60e+04 3.36e+02 8.54e+02 6.00e+01 7.67e+02 1.70e+04 5.25e+02 2.25e+02 6.38e+02 7.19e+02 9.31e+02 7.26e+02 8.12e+02 9.53e+02 2.84e+02 2.70e+04 1.06e+02 4.18e+02 7.95e+02 7.16e+02 8.20e+01 9.61e+02 5.99e+02 9.00e+03 8.51e+02 2.69e+02 5.23e+02 1.98e+02 2.00e+04 7.45e+02 7.59e+02 6.07e+02 8.97e+02 4.90e+02 8.52e+02 7.56e+02 5.51e+02 5.62e+02 5.48e+02 7.70e+02 7.66e+02 3.70e+02 3.15e+02 5.50e+02 8.93e+02 9.34e+02 7.80e+02 3.00e+02 5.36e+02 1.72e+02 3.21e+02 4.00e+02 8.81e+02 4.11e+02 9.67e+02 8.98e+02 4.42e+02 7.02e+02 8.71e+02 9.43e+02 5.58e+02 5.70e+02 6.87e+02 5.74e+02 2.37e+02 5.05e+02 9.79e+02 3.08e+02 3.72e+02 8.26e+02 8.90e+02 7.22e+02 7.94e+02 3.58e+02 7.01e+02 3.65e+02 7.99e+02 8.50e+02 1.07e+02 9.92e+02 2.76e+02 8.36e+02 8.09e+02 2.30e+01 2.93e+02 8.33e+02 3.60e+02 8.00e+03 3.92e+02 5.54e+02 1.13e+02 5.13e+02 9.29e+02 7.10e+01 6.35e+02 7.43e+02 8.86e+02 5.78e+02 8.01e+02 nan 6.31e+02 9.33e+02 7.98e+02 6.58e+02 6.04e+02 7.82e+02 4.64e+02 7.13e+02 7.73e+02 7.00e+02 4.52e+02 6.25e+02 7.62e+02 1.51e+02 7.10e+02 4.75e+02 6.53e+02 8.25e+02 7.35e+02 2.94e+02 9.11e+02 6.95e+02 5.37e+02 4.20e+01 5.08e+02 1.96e+02 8.20e+02 3.31e+02 7.40e+02 5.20e+02 5.60e+02 9.39e+02 8.57e+02 5.95e+02 4.19e+02 7.87e+02 7.23e+02 9.03e+02 1.65e+02 6.27e+02 5.26e+02 2.56e+02 8.06e+02 5.00e+02 5.59e+02 8.02e+02 6.00e+02 5.69e+02 6.24e+02 2.49e+02 1.17e+02 5.77e+02 6.43e+02 6.02e+02 6.91e+02 6.55e+02 9.47e+02 9.46e+02 
4.10e+02 8.05e+02 1.42e+02 1.83e+02 9.06e+02 6.60e+02 6.50e+02 3.50e+01 2.27e+02 5.79e+02 1.74e+02 3.88e+02 3.45e+02 5.92e+02 7.60e+02 1.69e+02 9.75e+02 7.00e+03 6.70e+02 8.11e+02 5.33e+02 4.37e+02 2.23e+02 3.11e+02 8.69e+02 4.00e+00 8.78e+02 9.82e+02 9.60e+02 1.77e+02 1.37e+02 6.42e+02 5.03e+02 2.43e+02 6.39e+02 4.96e+02 2.77e+02 4.48e+02 2.10e+01 2.74e+02 3.96e+02 7.39e+02 4.80e+01 1.80e+04 6.74e+02 9.71e+02 7.88e+02 0.00e+00 8.00e+01 6.30e+01 3.07e+02 9.40e+01 1.61e+02 4.22e+02 9.04e+02 8.72e+02 5.06e+02 4.36e+02 4.30e+02 7.64e+02 5.90e+02 4.27e+02 3.39e+02 9.88e+02 4.66e+02 5.76e+02 5.49e+02 1.30e+02 4.86e+02 9.64e+02 9.70e+02 4.50e+02 3.03e+02 1.03e+02 5.52e+02 8.89e+02 8.45e+02 2.70e+02 5.61e+02 6.92e+02 3.83e+02 4.40e+02 1.70e+01 5.29e+02 3.09e+02 8.43e+02 8.61e+02 8.34e+02 6.97e+02 5.88e+02 4.05e+02 5.85e+02 2.29e+02 3.27e+02 2.17e+02 8.41e+02 3.24e+02 3.26e+02 2.20e+02 4.51e+02 8.55e+02 1.49e+02 3.63e+02 3.94e+02 9.55e+02 1.45e+02 7.93e+02 6.98e+02 4.95e+02 7.30e+02 5.17e+02 7.18e+02 3.46e+02 5.80e+02 3.17e+02 3.80e+02 1.92e+02 7.83e+02 9.09e+02 5.93e+02 8.50e+01 2.70e+01 5.22e+02 6.10e+02 9.25e+02 9.13e+02 1.80e+01 8.62e+02 9.66e+02 6.64e+02 8.99e+02 3.30e+02 1.47e+02 3.87e+02 5.01e+02 6.17e+02 2.63e+02 4.97e+02 9.76e+02 2.08e+02 2.40e+02 2.57e+02 6.28e+02 3.34e+02 8.84e+02 9.44e+02 6.80e+02 7.20e+02 4.67e+02 6.68e+02 3.01e+02 7.48e+02 4.17e+02 2.19e+02 3.99e+02 3.44e+02 2.33e+02 4.41e+02 1.02e+02 2.65e+02 2.90e+02 3.29e+02 7.08e+02 2.98e+02 9.62e+02 8.91e+02 9.35e+02 8.47e+02 6.63e+02 8.64e+02 5.35e+02 8.27e+02 5.07e+02 8.60e+02 1.10e+02 5.91e+02 2.73e+02 4.00e+01 4.16e+02 5.18e+02 3.90e+02 7.96e+02 9.63e+02 4.60e+02 6.40e+02 8.79e+02 6.05e+02 8.28e+02 5.20e+01 3.67e+02 2.45e+02 6.20e+01 6.37e+02 9.57e+02 5.34e+02 5.68e+02 7.21e+02 1.35e+02 5.00e+01 8.83e+02 9.20e+02 8.18e+02 9.89e+02 2.58e+02 1.31e+02 1.71e+02 6.51e+02 9.12e+02 3.48e+02 9.95e+02 8.22e+02 2.89e+02 8.23e+02 4.14e+02 7.00e+00 4.72e+02 4.55e+02 1.84e+02 5.12e+02 9.23e+02 6.94e+02 6.19e+02 
9.40e+02 3.42e+02 5.31e+02 8.44e+02 5.55e+02 2.54e+02 6.29e+02 5.75e+02 9.77e+02 9.54e+02 2.90e+01 7.69e+02 1.15e+02 5.41e+02 4.23e+02 6.23e+02 3.37e+02 5.30e+02 9.49e+02 7.24e+02 9.10e+01 8.07e+02 1.08e+02 9.02e+02 2.79e+02 5.84e+02 6.41e+02 7.06e+02 8.94e+02 4.63e+02 3.04e+02 3.62e+02 4.61e+02 4.84e+02 4.40e+01 6.80e+01 2.02e+02 1.16e+02 1.41e+02 6.11e+02 7.38e+02 8.49e+02 2.80e+01 8.15e+02 3.49e+02 9.22e+02 4.30e+01 7.86e+02 4.89e+02 1.52e+02 3.12e+02 8.10e+01 8.96e+02 4.49e+02 1.57e+02 9.84e+02 8.74e+02 8.76e+02 1.37e+05 8.35e+02 9.80e+02 7.80e+01 2.44e+02 4.26e+02 3.51e+02 6.89e+02 4.13e+02 4.70e+01 7.41e+02 6.90e+02 6.10e+01 6.82e+02 6.83e+02 2.99e+02 6.13e+02 9.41e+02 3.22e+02 2.53e+02 9.26e+02 7.54e+02 9.08e+02 8.87e+02 3.82e+02 5.96e+02 5.45e+02 2.48e+02 4.60e+01 3.00e+00 7.55e+02 1.33e+02 5.42e+02 5.16e+02 2.85e+02 5.43e+02 2.10e+02 3.16e+02 1.63e+02 9.24e+02 4.76e+02 5.97e+02 6.69e+02 7.44e+02 7.07e+02 4.85e+02 5.21e+02 8.39e+02 7.90e+01 6.45e+02 7.34e+02 6.18e+02 3.98e+02 7.42e+02 8.38e+02 3.38e+02 2.24e+02 9.15e+02 6.48e+02 4.88e+02 4.45e+02 1.75e+02 7.30e+01 5.73e+02 3.28e+02 2.14e+02 6.03e+02 2.06e+02 2.20e+01 4.57e+02 2.00e+00 9.00e+00 8.56e+02 7.74e+02 6.49e+02 9.80e+01 1.50e+01 5.94e+02 4.03e+02 9.45e+02 4.91e+02 2.42e+02 9.17e+02 1.60e+02 3.59e+02 2.39e+02 8.77e+02 4.33e+02 6.54e+02 2.04e+02 6.52e+02 4.59e+02 4.39e+02 2.55e+02 2.21e+02 1.89e+02 7.20e+01 8.21e+02 5.71e+02 3.30e+01 7.12e+02 2.71e+02 6.81e+02 2.91e+02 1.99e+02 7.76e+02 9.42e+02 4.01e+02 6.86e+02 9.30e+01 3.43e+02 5.86e+02 2.36e+02 1.43e+02 7.85e+02 6.78e+02 3.78e+02 1.81e+02 1.10e+01 4.09e+02 9.48e+02 4.29e+02 5.90e+01 4.02e+02 4.47e+02 7.63e+02 2.86e+02 1.94e+02 3.47e+02 4.28e+02 6.26e+02 9.37e+02 2.61e+02 9.27e+02 7.27e+02 9.85e+02 9.90e+01 4.82e+02 5.00e+00 8.88e+02 1.54e+02 1.46e+02 1.32e+02 5.28e+02 2.64e+02 6.93e+02 5.47e+02 2.32e+02 5.87e+02 9.50e+01 2.52e+02 6.01e+02 6.50e+01 2.13e+02 6.77e+02 3.71e+02 1.39e+02 7.36e+02 1.11e+02 3.77e+02 8.90e+01 7.32e+02 6.73e+02 6.34e+02 
4.44e+02 8.37e+02 3.74e+02 1.55e+02 2.46e+02 4.34e+02 9.18e+02 4.99e+02 2.00e+01 5.40e+01 3.00e+01 3.10e+01 2.66e+02 2.34e+02 7.49e+02 1.73e+02 8.70e+02 1.28e+02 4.43e+02 2.95e+02 4.81e+02 8.59e+02 8.30e+01 5.04e+02 9.91e+02 2.28e+02 7.97e+02 2.81e+02 8.60e+01 1.23e+02 5.39e+02 2.50e+01 7.75e+02 2.41e+02 2.01e+02 1.34e+02 1.04e+02 2.26e+02 8.75e+02 1.00e+01 9.01e+02 9.00e+02 3.66e+02 8.30e+02 9.07e+02 4.65e+02 1.68e+02 1.91e+02 3.33e+02 7.60e+01 3.85e+02 1.00e+02 2.97e+02 9.38e+02 5.50e+01 3.70e+01 8.04e+02 3.32e+02 9.20e+01 4.94e+02 8.70e+01 4.10e+01 4.74e+02 6.12e+02 3.84e+02 1.78e+02 9.69e+02 5.32e+02 5.60e+01 7.50e+02 8.67e+02 3.60e+01 2.82e+02 4.71e+02 2.30e+02 3.90e+01 4.24e+02 6.56e+02 4.56e+02 1.44e+02 5.66e+02 8.00e+02 2.62e+02 3.76e+02 1.93e+02 3.41e+02 6.70e+01 2.78e+02 3.06e+02 2.03e+02 2.72e+02 6.36e+02 3.57e+02 7.51e+02 1.22e+02 9.60e+01 2.87e+02 1.18e+02 9.21e+02 1.70e+02 5.70e+01 2.92e+02 3.40e+02 3.80e+01 4.31e+02 9.83e+02 5.11e+02 3.10e+02 6.22e+02 6.46e+02 6.14e+02 1.64e+02 6.60e+01 3.19e+02 4.90e+01 3.97e+02 1.30e+01 4.50e+01 3.53e+02 1.79e+02 1.62e+02 3.55e+02 5.56e+02 2.15e+02 4.69e+02 7.40e+01 3.20e+01 1.14e+02 3.91e+02 6.90e+01 1.25e+02 3.50e+02 9.97e+02 8.42e+02 1.53e+02 3.05e+02 2.38e+02 1.85e+02 3.20e+02 1.88e+02 5.27e+02 7.25e+02 4.83e+02 6.85e+02 6.00e+00 5.44e+02 7.11e+02 1.56e+02 2.90e+04 2.75e+02 4.46e+02 2.11e+02 7.15e+02 1.86e+02 2.18e+02 6.16e+02 2.96e+02 2.00e+02 1.09e+02 1.36e+02 6.08e+02 4.62e+02 4.32e+02 1.76e+02 2.60e+01 3.18e+02 1.27e+02 8.92e+02 1.97e+02 2.12e+02 8.00e+00 2.09e+02 3.14e+02 9.05e+02 5.24e+02 6.59e+02 4.15e+02 4.79e+02 8.65e+02 1.05e+02 2.59e+02 4.21e+02 1.67e+02 1.12e+02 3.73e+02 9.86e+02 5.10e+01 1.01e+02 7.70e+01 2.88e+02 3.40e+01 5.14e+02 3.23e+02 4.38e+02 5.30e+01 4.07e+02 1.20e+02 7.78e+02 5.72e+02 2.80e+02 4.35e+02 6.99e+02 6.15e+02 7.33e+02 4.80e+02 1.26e+02 3.56e+02 2.22e+02 2.31e+02 1.87e+02 6.65e+02 6.33e+02 1.29e+02 6.40e+01 5.82e+02 8.80e+01 1.60e+01 2.51e+02 8.10e+02 8.13e+02 2.60e+02 3.25e+02 
1.24e+02 3.02e+02 6.06e+02 7.47e+02 1.40e+01 1.58e+02 2.47e+02 6.75e+02 8.40e+01 3.79e+02 1.59e+02 4.25e+02 3.54e+02 1.48e+02 5.89e+02 7.52e+02 4.53e+02 4.58e+02 7.00e+01 4.54e+02 7.81e+02 2.40e+01 1.21e+02 4.04e+02 4.20e+02 1.90e+01 3.61e+02 3.95e+02 2.07e+02 1.40e+02 5.02e+02 9.00e+01 5.19e+02 4.92e+02 2.35e+02 7.09e+02 4.87e+02 5.80e+01 8.73e+02 3.75e+02 7.50e+01 1.80e+02 1.66e+02 6.57e+02 2.05e+02 4.70e+02] imdb_score: - Total de datos únicos: 78 - Valores: [7.9 7.1 6.8 8.5 6.6 6.2 7.8 7.5 6.9 6.1 6.7 7.3 6.5 7.2 8.1 7. 7.7 8.2 5.9 6. 5.7 6.4 6.3 5.6 8.3 8. 8.4 5.8 5.4 9. 4.8 5.2 7.6 4.5 5.5 8.6 8.8 5.1 7.4 4.2 5. 4.9 3.7 5.3 4.3 3.8 4.4 3.3 2.2 8.9 8.7 4.6 2.4 3.4 4.1 4.7 3. 3.6 3.5 2.7 1.7 4. 2. 9.3 2.9 3.9 2.8 2.3 1.9 3.1 9.5 9.1 1.6 2.5 2.1 3.2 9.2 2.6] aspect_ratio: - Total de datos únicos: 23 - Valores: [ 1.78 2.35 nan 1.85 2. 2.2 2.39 2.24 1.33 4. 1.66 1.5 16. 1.77 2.4 1.37 2.76 1.18 1.44 2.55 1.2 1.75 1.89] movie_facebook_likes: - Total de datos únicos: 876 - Valores: [ 33000 0 85000 164000 24000 29000 118000 10000 197000 5000 48000 123000 58000 40000 65000 56000 17000 83000 26000 72000 44000 150000 80000 95000 60000 41000 30000 94000 129000 82000 92000 22000 115000 23000 46000 20000 39000 16000 13000 54000 37000 27000 42000 2000 77000 18000 53000 89000 45000 677 35000 55000 67000 96000 349000 175000 166000 14000 38000 11000 8000 15000 63000 191000 19000 47000 62000 3000 25000 51000 190000 6000 61000 71000 40 25 52000 31000 122000 97000 459 68000 28000 291 147000 12000 4000 304 36000 894 21000 946 153000 53 199000 108000 138000 124000 881 416 578 66000 701 1000 9000 70000 988 979 788 59000 372 863 49000 941 374 7000 57000 140000 91 607 951 32000 257 665 964 995 785 138 413 893 509 105000 43000 648 683 880 266 886 694 34000 792 531 997 584 391 815 764 617 98000 144000 688 892 177 295 114000 912 146000 885 781 858 747 829 797 64000 621 448 690 956 846 470 589 791 641 426 117000 296 112000 990 500 472 782 960 953 316 610 437 361 853 672 605 74000 90000 795 
955 624 970 773 669 812 718 877 705 915 779 697 943 422 288 255 352 828 522 261 616 975 115 86000 743 593 451 916 652 68 353 748 50000 455 161 505 101000 681 663 263 390 445 301 702 394 279 299 680 309 555 604 754 835 630 849 579 612 866 328 401 158 565 211 823 265 89 478 989 689 168 911 389 474 366 240 852 591 133 638 919 913 949 538 452 433 471 704 494 561 675 262 633 26 120 314 742 463 783 350 643 625 559 290 149000 181 883 81000 75000 602 78000 771 592 484 491 999 629 204 462 654 387 287 447 201 209 76000 418 99000 517 73000 464 845 315 817 188 359 736 874 167 260 319 145 826 504 567 488 425 660 542 666 187 329 671 686 657 901 834 564 498 377 246 378 503 215 581 982 348 739 206 83 937 507 942 636 90 200 891 876 619 650 153 839 408 148000 376 532 767 765 887 804 716 104 272 687 758 897 141 365 284 800 131000 806 302 227 347 930 613 590 431 124 501 264 4 269 713 983 977 444 311 855 676 233 924 548 346 847 608 971 235 241 831 973 399 136 217 30 58 70 332 560 238 228 599 588 813 562 827 342 271 320 476 814 110 339 419 300 154 903 393 487 492 371 818 392 573 762 356 921 939 824 282 191 558 165000 896 810 466 243 673 327 175 116 489 566 622 251 436 247 576 923 816 515 647 157 733 843 708 664 193 473 18 821 277 998 373 184 530 889 411 645 932 905 860 231 127 76 512 934 64 502 106000 833 109000 862 679 22 618 186 69 107 139 337 344 770 93000 902 549 151 439 242 656 305 135 423 88 458 784 634 355 443 774 541 745 449 405 550 174 967 85 659 646 119 250 981 744 368 720 331 851 441 808 518 49 398 944 345 208 106 533 74 725 952 33 117 661 321 370 546 402 160 695 486 746 850 957 898 580 838 313 375 128 278 196 341 467 495 938 125 77 859 429 79 412 226 434 75 143 294 520 438 81 205 508 140 31 655 223 108 129 978 407 44 446 357 962 122 724 97 55 453 41 586 479 802 19 830 403 256 985 926 28 84 289 29 52 224 570 176 275 543 109 244 113 421 620 963 232 575 270 738 213 740 207 729 249 920 323 225 274 777 639 38 165 190 35 750 864 11 2 199 285 539 528 62 61 283 842 118 414 931 131 
987 872 111 614 182 132 98 325 798 170 237 456 385 358 280 360 121 721 482 606 799 100 63 710 609 933 635 545 837 595 367 417 430 47 450 442 48 54 32 406 395 37 229 480 968 369 349 381 50 254 819 39 16 144 79000 172 870 162 14 380 763 236 60 134 43 166 974 857 10 715 195 825 805 651 728 298 544 511 82 169 8 569 326 73 794 869 844 9 682 415 307 212 216 594 57 400 969 220 66 632 221 756 27 87 477 954 51 234 603 92 571 24 150 587 706 210 793 17 42 259 297 3 936 884 698 409 312 219 130 396 163 420 523 303 102 203 5 12 34 36 7 126 180 86 155 183 23 707 123 460 865 194 526 691 583 379 363 114 46 178 93 67 519 71 72 801 465 96 198 267 142 65 197 667 239 13 535 45 324 171 105 424 20]
General analysis of the dataset¶
Basic summary statistics for the numeric columns of the dataset
dataset.describe()
| | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_1_facebook_likes | gross | num_voted_users | cast_total_facebook_likes | facenumber_in_poster | num_user_for_reviews | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 4993.000000 | 5028.000000 | 4939.000000 | 5020.000000 | 5036.000000 | 4.159000e+03 | 5.043000e+03 | 5043.000000 | 5030.000000 | 5022.000000 | 4.551000e+03 | 4935.000000 | 5030.000000 | 5043.000000 | 4714.000000 | 5043.000000 |
| mean | 140.194272 | 107.201074 | 686.509212 | 645.009761 | 6560.047061 | 4.846841e+07 | 8.366816e+04 | 9699.063851 | 1.371173 | 272.770808 | 3.975262e+07 | 2002.470517 | 1651.754473 | 6.442138 | 2.220403 | 7525.964505 |
| std | 121.601675 | 25.197441 | 2813.328607 | 1665.041728 | 15020.759120 | 6.845299e+07 | 1.384853e+05 | 18163.799124 | 2.013576 | 377.982886 | 2.061149e+08 | 12.474599 | 4042.438863 | 1.125116 | 1.385113 | 19320.445110 |
| min | 1.000000 | 7.000000 | 0.000000 | 0.000000 | 0.000000 | 1.620000e+02 | 5.000000e+00 | 0.000000 | 0.000000 | 1.000000 | 2.180000e+02 | 1916.000000 | 0.000000 | 1.600000 | 1.180000 | 0.000000 |
| 25% | 50.000000 | 93.000000 | 7.000000 | 133.000000 | 614.000000 | 5.340988e+06 | 8.593500e+03 | 1411.000000 | 0.000000 | 65.000000 | 6.000000e+06 | 1999.000000 | 281.000000 | 5.800000 | 1.850000 | 0.000000 |
| 50% | 110.000000 | 103.000000 | 49.000000 | 371.500000 | 988.000000 | 2.551750e+07 | 3.435900e+04 | 3090.000000 | 1.000000 | 156.000000 | 2.000000e+07 | 2005.000000 | 595.000000 | 6.600000 | 2.350000 | 166.000000 |
| 75% | 195.000000 | 118.000000 | 194.500000 | 636.000000 | 11000.000000 | 6.230944e+07 | 9.630900e+04 | 13756.500000 | 2.000000 | 326.000000 | 4.500000e+07 | 2011.000000 | 918.000000 | 7.200000 | 2.350000 | 3000.000000 |
| max | 813.000000 | 511.000000 | 23000.000000 | 23000.000000 | 640000.000000 | 7.605058e+08 | 1.689764e+06 | 656730.000000 | 43.000000 | 5060.000000 | 1.221550e+10 | 2016.000000 | 137000.000000 | 9.500000 | 16.000000 | 349000.000000 |
Column data types and non-null counts in the dataset
dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5043 entries, 0 to 5042
Data columns (total 28 columns):
 #   Column                     Non-Null Count  Dtype
---  ------                     --------------  -----
 0   color                      5024 non-null   object
 1   director_name              4939 non-null   object
 2   num_critic_for_reviews     4993 non-null   float64
 3   duration                   5028 non-null   float64
 4   director_facebook_likes    4939 non-null   float64
 5   actor_3_facebook_likes     5020 non-null   float64
 6   actor_2_name               5030 non-null   object
 7   actor_1_facebook_likes     5036 non-null   float64
 8   gross                      4159 non-null   float64
 9   genres                     5043 non-null   object
 10  actor_1_name               5036 non-null   object
 11  movie_title                5043 non-null   object
 12  num_voted_users            5043 non-null   int64
 13  cast_total_facebook_likes  5043 non-null   int64
 14  actor_3_name               5020 non-null   object
 15  facenumber_in_poster       5030 non-null   float64
 16  plot_keywords              4890 non-null   object
 17  movie_imdb_link            5043 non-null   object
 18  num_user_for_reviews       5022 non-null   float64
 19  language                   5029 non-null   object
 20  country                    5038 non-null   object
 21  content_rating             4740 non-null   object
 22  budget                     4551 non-null   float64
 23  title_year                 4935 non-null   float64
 24  actor_2_facebook_likes     5030 non-null   float64
 25  imdb_score                 5043 non-null   float64
 26  aspect_ratio               4714 non-null   float64
 27  movie_facebook_likes       5043 non-null   int64
dtypes: float64(13), int64(3), object(12)
memory usage: 1.1+ MB
Cleaning irrelevant or "dirty" columns¶
The following variables are removed from the dataset because they add noise to the model and contribute no real predictive value:
| Variable | Reason |
|---|---|
| movie_imdb_link | External identifier (URL) |
| movie_title | Nominal identifier with no predictive value |
| director_name | Nominal identifier with no predictive value |
| actor_1_name | Nominal identifier with no predictive value |
| actor_2_name | Nominal identifier with no predictive value |
| actor_3_name | Nominal identifier with no predictive value |
| plot_keywords | Free text that would only be useful with NLP |
cols_to_drop = [
'movie_imdb_link', 'movie_title', 'director_name',
'actor_1_name', 'actor_2_name', 'actor_3_name',
'plot_keywords',
]
dataset = dataset.drop(columns=cols_to_drop)
dataset.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5043 entries, 0 to 5042
Data columns (total 21 columns):
 #   Column                     Non-Null Count  Dtype
---  ------                     --------------  -----
 0   color                      5024 non-null   object
 1   num_critic_for_reviews     4993 non-null   float64
 2   duration                   5028 non-null   float64
 3   director_facebook_likes    4939 non-null   float64
 4   actor_3_facebook_likes     5020 non-null   float64
 5   actor_1_facebook_likes     5036 non-null   float64
 6   gross                      4159 non-null   float64
 7   genres                     5043 non-null   object
 8   num_voted_users            5043 non-null   int64
 9   cast_total_facebook_likes  5043 non-null   int64
 10  facenumber_in_poster       5030 non-null   float64
 11  num_user_for_reviews       5022 non-null   float64
 12  language                   5029 non-null   object
 13  country                    5038 non-null   object
 14  content_rating             4740 non-null   object
 15  budget                     4551 non-null   float64
 16  title_year                 4935 non-null   float64
 17  actor_2_facebook_likes     5030 non-null   float64
 18  imdb_score                 5043 non-null   float64
 19  aspect_ratio               4714 non-null   float64
 20  movie_facebook_likes       5043 non-null   int64
dtypes: float64(13), int64(3), object(5)
memory usage: 827.5+ KB
dataset.head()
| | color | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_1_facebook_likes | gross | genres | num_voted_users | cast_total_facebook_likes | ... | num_user_for_reviews | language | country | content_rating | budget | title_year | actor_2_facebook_likes | imdb_score | aspect_ratio | movie_facebook_likes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Color | 723.0 | 178.0 | 0.0 | 855.0 | 1000.0 | 760505847.0 | Action\|Adventure\|Fantasy\|Sci-Fi | 886204 | 4834 | ... | 3054.0 | English | USA | PG-13 | 237000000.0 | 2009.0 | 936.0 | 7.9 | 1.78 | 33000 |
| 1 | Color | 302.0 | 169.0 | 563.0 | 1000.0 | 40000.0 | 309404152.0 | Action\|Adventure\|Fantasy | 471220 | 48350 | ... | 1238.0 | English | USA | PG-13 | 300000000.0 | 2007.0 | 5000.0 | 7.1 | 2.35 | 0 |
| 2 | Color | 602.0 | 148.0 | 0.0 | 161.0 | 11000.0 | 200074175.0 | Action\|Adventure\|Thriller | 275868 | 11700 | ... | 994.0 | English | UK | PG-13 | 245000000.0 | 2015.0 | 393.0 | 6.8 | 2.35 | 85000 |
| 3 | Color | 813.0 | 164.0 | 22000.0 | 23000.0 | 27000.0 | 448130642.0 | Action\|Thriller | 1144337 | 106759 | ... | 2701.0 | English | USA | PG-13 | 250000000.0 | 2012.0 | 23000.0 | 8.5 | 2.35 | 164000 |
| 4 | NaN | NaN | NaN | 131.0 | NaN | 131.0 | NaN | Documentary | 8 | 143 | ... | NaN | NaN | NaN | NaN | NaN | NaN | 12.0 | 7.1 | NaN | 0 |
5 rows × 21 columns
Inferring variable types¶
Define the analysis target. The target variable depends on the goal of the analysis, for example:
- To predict the quality or success of a film: imdb_score
- To predict revenue: gross
- To predict the audience rating: content_rating
- To predict popularity on social media: movie_facebook_likes
- To predict the release year: title_year
- To predict the genre classification: genres
In this example I use imdb_score as the target.
target = 'imdb_score'
features = [i for i in dataset.columns if i not in [target]]
number_unique_rows = dataset[features].nunique()
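Inference of numerical and categorical features. A column is treated as categorical if its dtype is object or if it has at most 45 distinct values; otherwise it is treated as numerical.
numerical_features = []
categorical_features = []
for col in features:
    # object dtype or low cardinality (<= 45 distinct values) -> categorical
    if dataset[col].dtype == 'object' or number_unique_rows[col] <= 45:
        categorical_features.append(col)
    else:
        numerical_features.append(col)
print('\n\033[1mInferencia:\033[0m El dataset tiene {} features numéricas y {} features categóricas.'.format(len(numerical_features),len(categorical_features)))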
Inferencia: El dataset tiene 13 features numéricas y 7 features categóricas.
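The dtype-plus-cardinality heuristic used above can be tried on a toy frame to see which way each column falls. This is only a sketch: the helper name infer_feature_types and its max_unique parameter are illustrative and not part of the notebook, and the toy threshold of 2 stands in for the notebook's 45.

```python
import pandas as pd

def infer_feature_types(df, target, max_unique=45):
    """Split columns into (numerical, categorical) by dtype and cardinality."""
    numerical, categorical = [], []
    n_unique = df.nunique()  # distinct non-null values per column
    for col in df.columns:
        if col == target:
            continue  # the target is not a feature
        if df[col].dtype == "object" or n_unique[col] <= max_unique:
            categorical.append(col)
        else:
            numerical.append(col)
    return numerical, categorical

# Hypothetical toy data; max_unique=2 replaces the notebook's 45 so the effect is visible.
toy = pd.DataFrame({
    "color": ["Color", "Color", "Black and White", "Color"],
    "duration": [178.0, 169.0, 148.0, 164.0],
    "facenumber_in_poster": [0.0, 0.0, 1.0, 0.0],
    "imdb_score": [7.9, 7.1, 6.8, 8.5],
})
numerical, categorical = infer_feature_types(toy, target="imdb_score", max_unique=2)
print(numerical, categorical)
```

Note that a numeric column with few distinct values, such as facenumber_in_poster here, ends up categorical under this rule, so the threshold is worth reviewing per dataset.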
Null-value validation¶
Count of null values per column in the dataset
dataset.isnull().sum()
color                         19
num_critic_for_reviews        50
duration                      15
director_facebook_likes      104
actor_3_facebook_likes        23
actor_1_facebook_likes         7
gross                        884
genres                         0
num_voted_users                0
cast_total_facebook_likes      0
facenumber_in_poster          13
num_user_for_reviews          21
language                      14
country                        5
content_rating               303
budget                       492
title_year                   108
actor_2_facebook_likes        13
imdb_score                     0
aspect_ratio                 329
movie_facebook_likes           0
dtype: int64
Percentage of null values per feature
for key in dataset.keys():
    null_sum = dataset[key].isnull().sum()
    if null_sum > 0:
        # share of rows where this column is null, as a percentage
        percentage = null_sum/dataset.shape[0] * 100
        print(f"\033[1m{key}:\033[0m {format_decimals(percentage)}%")
color: 0.38%
num_critic_for_reviews: 0.99%
duration: 0.3%
director_facebook_likes: 2.06%
actor_3_facebook_likes: 0.46%
actor_1_facebook_likes: 0.14%
gross: 17.53%
facenumber_in_poster: 0.26%
num_user_for_reviews: 0.42%
language: 0.28%
country: 0.1%
content_rating: 6.01%
budget: 9.76%
title_year: 2.14%
actor_2_facebook_likes: 0.26%
aspect_ratio: 6.52%
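The format_decimals helper is not defined in this part of the notebook and is presumably declared earlier. As a rough equivalent that does not depend on it, the same percentages can be computed in one vectorized expression. The sketch below assumes that rounding to two decimals is all the helper does and uses a hypothetical toy frame.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values.
toy = pd.DataFrame({
    "gross": [1.0, np.nan, 3.0, np.nan],
    "budget": [np.nan, 2.0, 3.0, 4.0],
    "imdb_score": [7.9, 7.1, 6.8, 8.5],
})

# isnull().mean() is the fraction of nulls per column; scale it to a percentage.
null_pct = toy.isnull().mean().mul(100).round(2)

# Keep only columns that actually have nulls, worst first.
null_pct = null_pct[null_pct > 0].sort_values(ascending=False)
print(null_pct)
```

Sorting in descending order makes the columns that need the most attention (gross and budget in the real dataset) appear first.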
fig, ax = plt.subplots(figsize=(15, 15))
sns.heatmap(dataset.isnull(), cbar=False, cmap="viridis")
<Axes: >
Data imputation¶
- Mode imputation: fill nulls with the most frequent value for the columns color, facenumber_in_poster, language, aspect_ratio
features = ["color", "facenumber_in_poster", "language", "aspect_ratio"]
def impute_value_by_mode(variables):
    for var in variables:
        # mode() returns a Series; take the first (most frequent) value
        var_mode = dataset[var].mode()[0]
        print(f"{var}: {var_mode}")
        dataset[var] = dataset[var].fillna(var_mode)
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_by_mode(features)
color: Color
Validación de nulos para color: 0
facenumber_in_poster: 0.0
Validación de nulos para facenumber_in_poster: 0
language: English
Validación de nulos para language: 0
aspect_ratio: 2.35
Validación de nulos para aspect_ratio: 0
- Category imputation: fill nulls with the category "Unknown" for the column content_rating. (Originally the columns director_name, actor_1_name, actor_2_name, actor_3_name, and plot_keywords were also going to be imputed this way, but they were dropped when the dirty columns were removed from the dataset.)
# features = ["director_name", "actor_1_name", "actor_2_name", "actor_3_name", "plot_keywords", "content_rating"]
features = ["content_rating"]
def impute_value_by_category(variables):
    for var in variables:
        category = "Unknown"
        dataset[var] = dataset[var].fillna(category)
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_by_category(features)
Validación de nulos para content_rating: 0
- Mean imputation: fill nulls with the column mean for the columns num_critic_for_reviews, num_user_for_reviews
features = ["num_critic_for_reviews", "num_user_for_reviews"]
def impute_value_by_mean(variables):
    for var in variables:
        var_mean = dataset[var].mean()
        print(f"{var}: {var_mean}")
        dataset[var] = dataset[var].fillna(var_mean)
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_by_mean(features)
num_critic_for_reviews: 140.1942719807731
Validación de nulos para num_critic_for_reviews: 0
num_user_for_reviews: 272.77080844285143
Validación de nulos para num_user_for_reviews: 0
- Median imputation: fill nulls with the column median for the columns duration, title_year
features = ["duration", "title_year"]
def impute_value_by_median(variables):
    for var in variables:
        var_median = dataset[var].median()
        print(f"{var}: {var_median}")
        dataset[var] = dataset[var].fillna(var_median)
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_by_median(features)
duration: 103.0
Validación de nulos para duration: 0
title_year: 2005.0
Validación de nulos para title_year: 0
- Zero imputation: fill nulls with 0 for the columns director_facebook_likes, actor_1_facebook_likes, actor_2_facebook_likes, actor_3_facebook_likes
features = ["director_facebook_likes", "actor_1_facebook_likes", "actor_2_facebook_likes", "actor_3_facebook_likes"]
def impute_value_with_zeros(variables):
    for var in variables:
        dataset[var] = dataset[var].fillna(0)
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_with_zeros(features)
Validación de nulos para director_facebook_likes: 0
Validación de nulos para actor_1_facebook_likes: 0
Validación de nulos para actor_2_facebook_likes: 0
Validación de nulos para actor_3_facebook_likes: 0
- Interpolation imputation: linear interpolation for the columns gross, budget.
features = ["gross", "budget"]
def impute_value_by_interpolate(variables):
    for var in variables:
        # linear interpolation between the nearest non-null rows in index order
        dataset[var] = dataset[var].interpolate(method="linear")
        print(f"Validación de nulos para {var}: {dataset[var].isnull().sum()}")
impute_value_by_interpolate(features)
Validación de nulos para gross: 0
Validación de nulos para budget: 0
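To make the behaviour of linear interpolation concrete, the sketch below applies it to a small hypothetical Series. Each gap is filled from the neighbouring non-null rows in index order, so the result depends on how the rows are sorted. With the default settings a null in the very first row has no earlier neighbour and stays null; passing limit_direction="both" is one way to cover that case. In the notebook both columns end with zero nulls, which suggests their first rows were already populated.

```python
import numpy as np
import pandas as pd

# Hypothetical values: a leading gap and a two-row interior gap.
gross = pd.Series([np.nan, 100.0, np.nan, np.nan, 400.0])

# Default behaviour: interior gaps are filled on a straight line
# between the surrounding values (100 -> 200 -> 300 -> 400).
linear = gross.interpolate(method="linear")
print(linear.tolist())

# limit_direction="both" also fills the leading gap with the first valid value.
both = gross.interpolate(method="linear", limit_direction="both")
print(both.tolist())
```

Because films in adjacent rows are not necessarily related, the interpolated gross and budget values are best read as placeholders rather than estimates.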
- Conditional imputation: fill country based on the film's language.
country_by_language = {
'Aboriginal': "Australia",
'Arabic': random.choice(["Egypt", "Libya", "United Arab Emirates"]),
'Aramaic': random.choice(["Syria", "Iraq"]),
'Bosnian': "Bosnia",
'Cantonese': random.choice(["Hong Kong", "China"]),
'Chinese': "China",
'Czech': "Czech Republic",
'Danish': "Denmark",
'Dari': "Afghanistan",
'Dutch': random.choice(["Netherlands", "Belgium", "Aruba"]),
'Dzongkha': "Bhutan",
'English': random.choice(['USA', 'UK', 'New Zealand', 'Canada', 'Australia', 'Ireland', 'South Africa', 'Bahamas', 'Nigeria', 'Philippines']),
'Filipino': "Philippines",
'French': random.choice(["France", "Belgium", "Canada", "Switzerland", "Cameroon"]),
'German': random.choice(["Germany", "Austria", "Switzerland", "West Germany"]),
'Greek': "Greece",
'Hebrew': "Israel",
'Hindi': "India",
'Hungarian': "Hungary",
'Icelandic': "Iceland",
'Indonesian': "Indonesia",
'Italian': random.choice(["Italy", "Switzerland"]),
'Japanese': "Japan",
'Kannada': "India",
'Kazakh': "Kazakhstan",
'Korean': "South Korea",
'Mandarin': random.choice(["China", "Taiwan"]),
'Maya': "Mexico",
'Mongolian': "Mongolia",
'Norwegian': "Norway",
'Panjabi': random.choice(["Pakistan", "India"]),
'Persian': random.choice(["Iran", "Afghanistan"]),
'Polish': "Poland",
'Portuguese': random.choice(["Brazil", "Portugal"]),
'Romanian': "Romania",
'Russian': random.choice(["Russia", "Soviet Union", "Kyrgyzstan"]),
'Slovenian': "Slovenia",
'Spanish': random.choice(["Mexico", "Spain", "Argentina", "Colombia", "Chile", "Panama", "Peru", "Dominican Republic"]),
'Swahili': "Kenya",
'Swedish': random.choice(["Sweden", "Finland"]),
'Tamil': random.choice(["India", "Sri Lanka"]),
'Telugu': "India",
'Thai': "Thailand",
'Urdu': random.choice(["Pakistan", "India"]),
'Vietnamese': "Vietnam",
'Zulu': "South Africa",
}
def impute_country(row):
    # Only rows with a missing country are looked up by language
    if pd.isnull(row['country']):
        return country_by_language.get(row["language"], row['country'])
    return row['country']
dataset['country'] = dataset.apply(impute_country, axis=1) # type: ignore
print(f"Validación de nulos para country: {dataset['country'].isnull().sum()}")
Validación de nulos para country: 0
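One detail of the dictionary above: each random.choice(...) call runs once, when the dictionary is built, so every missing country for a given language receives the same pick, and that pick changes between runs unless the random module is seeded. If a fresh draw per row is wanted, one option is to store the candidate lists and sample inside the function. In the sketch below the names candidates_by_language and impute_country_per_row, the shortened candidate lists, and the seed are all illustrative assumptions.

```python
import random
import pandas as pd

# Hypothetical, shortened mapping: language -> list of candidate countries.
candidates_by_language = {
    "English": ["USA", "UK", "Canada"],
    "Japanese": ["Japan"],
}
rng = random.Random(42)  # seeded so reruns produce the same imputations

def impute_country_per_row(row):
    """Return the existing country, or draw one from the language's candidates."""
    if pd.isnull(row["country"]):
        options = candidates_by_language.get(row["language"])
        # Unknown language: leave the value missing instead of guessing.
        return rng.choice(options) if options else row["country"]
    return row["country"]

toy = pd.DataFrame({
    "language": ["English", "Japanese", "English", "Klingon"],
    "country": [None, None, "UK", None],
})
toy["country"] = toy.apply(impute_country_per_row, axis=1)
print(toy)
```

With only 5 missing countries in this dataset the practical difference is small, but seeding keeps the notebook reproducible.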
Null-value validation after imputation¶
Review of null counts
dataset.isnull().sum()
color                        0
num_critic_for_reviews       0
duration                     0
director_facebook_likes      0
actor_3_facebook_likes       0
actor_1_facebook_likes       0
gross                        0
genres                       0
num_voted_users              0
cast_total_facebook_likes    0
facenumber_in_poster         0
num_user_for_reviews         0
language                     0
country                      0
content_rating               0
budget                       0
title_year                   0
actor_2_facebook_likes       0
imdb_score                   0
aspect_ratio                 0
movie_facebook_likes         0
dtype: int64
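Besides reading the counts by eye, the same check can be made to fail loudly if any null survives imputation. The helper name assert_no_nulls is hypothetical; this is a minimal sketch on toy frames, not part of the notebook.

```python
import numpy as np
import pandas as pd

def assert_no_nulls(df):
    """Raise ValueError listing any columns that still contain nulls."""
    remaining = df.isnull().sum()
    remaining = remaining[remaining > 0]
    if not remaining.empty:
        raise ValueError(f"Columns still containing nulls: {remaining.to_dict()}")

clean = pd.DataFrame({"a": [1.0, 2.0], "b": ["x", "y"]})
assert_no_nulls(clean)  # passes silently

dirty = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", "y"]})
try:
    assert_no_nulls(dirty)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```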
Review of the null-value heatmap
fig, ax = plt.subplots(figsize=(15, 15))
sns.heatmap(dataset.isnull(), cbar=False, cmap="viridis")
<Axes: >
Exploratory Data Analysis (EDA)¶
is_interactive = False
Raw boxplot¶
A boxplot shows how the values in the dataset are distributed and where they concentrate. Values that fall outside the whiskers indicate films with unusually high or low scores, known as outliers. The plot also reveals symmetry or skew, depending on whether the box is centered or shifted to one side.
It also helps decide whether items with little relevance should be filtered out of the dataset.
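The points drawn beyond the whiskers usually follow Tukey's rule: anything further than 1.5 times the interquartile range (IQR) from the box counts as an outlier. That is the default in matplotlib and seaborn; whether the notebook's EDAVisualizerStatic and EDAVisualizerInteractive helpers keep that default is an assumption. A minimal sketch of the rule, with a hypothetical helper iqr_outliers and made-up scores:

```python
import pandas as pd

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartile box."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return values[(values < lower) | (values > upper)]

# Hypothetical IMDb scores for one country.
scores = pd.Series([6.1, 6.4, 6.6, 6.8, 7.0, 7.2, 7.5, 1.6, 9.5])
outliers = iqr_outliers(scores)
print(sorted(outliers.tolist()))
```

Here Q1 is 6.4 and Q3 is 7.2, so the fences sit near 5.2 and 8.4, and the scores 1.6 and 9.5 fall outside them.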
if is_interactive:
    EDAVisualizerInteractive.plot_boxplot("Comparación de IMDb Score por País", dataset, "country", "imdb_score", "País", "IMDb Score")
else:
    EDAVisualizerStatic.plot_boxplot("Comparación de IMDb Score por País", dataset, "country", "imdb_score", "País", "IMDb Score")
Histograms¶
Observe how the target variable behaves for the countries that have enough records in the dataset. A country is kept only if its number of records is strictly greater than the first quartile (Q1) of the per-country record counts.
country_counts = dataset["country"].value_counts()
q1 = country_counts.quantile(0.25)
filtered_countries = country_counts.loc[lambda s: s > q1].index.tolist()
if is_interactive:
    EDAVisualizerInteractive.plot_histogram_by_category(
        title=f"Distribución de IMDb Score en {filtered_countries[0]}" if filtered_countries else "Distribución de IMDb Score",
        subtitle="Distribución de IMDb Score en",
        data=dataset,
        category_col="country",
        value_col=target,
        categories=filtered_countries,
        nbins=20,
        x_label="IMDb Score",
        y_label="Frecuencia",
        save_html=True,
        save_folder="assets",
    )
else:
    for country in filtered_countries:
        EDAVisualizerStatic.plot_histogram(
            title=f"Histograma de IMDb Score para {country}",
            data=dataset[dataset["country"] == country],
            column=target,
            x_label="IMDb Score",
            y_label="Frecuencia",
            bins=20,
            x_range=(0, 10)
        )
Filtering the dataset¶
dataset_filtered = dataset[dataset["country"].isin(filtered_countries)]
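The frequency filter can be checked on a small example. The sketch below uses a hypothetical toy table (the country counts and scores are invented for illustration) and applies the same steps: count records per country, take the first quartile of those counts, and keep only the countries strictly above it.

```python
import pandas as pd

# Hypothetical toy data: four countries with very different record counts.
toy = pd.DataFrame({
    "country": ["USA"] * 6 + ["UK"] * 3 + ["France"] * 2 + ["Peru"],
    "imdb_score": [6.1, 7.0, 5.5, 6.8, 7.2, 6.4, 7.1, 6.9, 6.0, 7.4, 6.6, 5.9],
})

counts = toy["country"].value_counts()      # USA 6, UK 3, France 2, Peru 1
q1 = counts.quantile(0.25)                  # first quartile of the frequencies
keep = counts[counts > q1].index.tolist()   # strictly above Q1, as in the notebook
toy_filtered = toy[toy["country"].isin(keep)]

print(q1, keep, len(toy_filtered))
```

With the default linear interpolation, Q1 of the counts [1, 2, 3, 6] is 1.75, so only the single-record country is dropped. Because the comparison is strict, a country whose count equals Q1 exactly would also be dropped.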
Filtered boxplot¶
Compare the countries with a boxplot, leaving out the countries that have few records.
if is_interactive:
    EDAVisualizerInteractive.plot_boxplot(
        title="Comparación filtrada de IMDb Score por País",
        data=dataset_filtered,
        x="country", y="imdb_score", x_label="País", y_label="IMDb Score"
    )
else:
    EDAVisualizerStatic.plot_boxplot(
        title="Comparación filtrada de IMDb Score por País",
        data=dataset_filtered,
        x="country", y="imdb_score", x_label="País", y_label="IMDb Score"
    )
Visualizing categorical features¶
if is_interactive:
    EDAVisualizerInteractive.plot_categorical_counts_dropdown(
        title="Frecuencia de categorías",
        data=dataset_filtered,
        categorical_cols=categorical_features,
        top_n=8,  # show only the top 8 when a column has many categories
        template="plotly_white",
        save_html=True,  # saved to assets/Frecuencia_de_categorías.html
        save_folder="assets",
        show=True
    )
else:
    EDAVisualizerStatic.plot_categorical_counts_grid(
        data=dataset_filtered,
        categorical_cols=categorical_features,
        n_cols=2,
        top_n=8,
        rotate_xticks=45,  # rotate the x tick labels for readability
        show=True
    )
Visualizing numerical features¶
if is_interactive:
    EDAVisualizerInteractive.plot_numerical_hists_dropdown(
        title="Distribuciones numéricas",
        data=dataset_filtered,
        numerical_cols=numerical_features,
        nbins=30,
        template="plotly_white",
        save_html=True,  # saved to assets/Distribuciones_numéricas.html
        save_folder="assets",
        show=True
    )
else:
    n = 2  # number of subplot columns
    n_rows = math.ceil(len(numerical_features) / n)
    plt.figure(figsize=[15, 3 * n_rows])
    for i, col in enumerate(numerical_features):
        plt.subplot(n_rows, n, i + 1)
        sns.histplot(data=dataset_filtered, x=col, bins=30, kde=True, color='steelblue')
        plt.title(col)
        plt.xlabel(col)
        plt.ylabel("Frecuencia")
    plt.tight_layout(pad=2.0)
    plt.subplots_adjust(hspace=0.6, wspace=0.4)
    plt.show()
Scatter matrix¶
Understand the pairwise relationships among all the features.
if is_interactive:
    EDAVisualizerInteractive.plot_pairplot(
        dataset=dataset_filtered,
        title='Matriz de dispersión interactiva',
        height=2500,
        width=2500,
    )
else:
    EDAVisualizerStatic.plot_pairplot(
        dataset=dataset_filtered,
        title='Pairplots for all the features'
    )
Removing outliers¶
Rows are removed when any numeric variable has a z-score whose absolute value is 3 or more. The z-score is the standardized value that tells how many standard deviations a value lies from the mean of its distribution.
A z-score of 0 means the value sits exactly at the mean; +1 means one standard deviation above the mean; -2 means two standard deviations below it. A value with a z-score above +3 or below -3 is treated as an outlier.
z_scores = np.abs(zscore(dataset_filtered.select_dtypes(include='number'))) # type: ignore
dataset_no_outliers = dataset_filtered[(z_scores < 3).all(axis=1)]
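The IQR is then computed on the numeric columns only, and rows with any value outside [Q1 - 1.5·IQR, Q3 + 1.5·IQR] are dropped. Because this step starts again from dataset_filtered and reassigns dataset_no_outliers, it replaces the z-score result above; the counts printed below reflect the IQR rule only.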
numeric_cols = dataset_filtered.select_dtypes(include='number')
Q1 = numeric_cols.quantile(0.25)
Q3 = numeric_cols.quantile(0.75)
IQR = Q3 - Q1
dataset_no_outliers = dataset_filtered[~((numeric_cols < (Q1 - 1.5 * IQR)) | (numeric_cols > (Q3 + 1.5 * IQR))).any(axis=1)]
print('\n\033[1mInferencia: \033[0mAntes de remover outliers, el dataset tenía {} ejemplos.'.format(dataset_filtered.shape[0]))
print('Después de remover los outliers, el dataset ahora tiene {} ejemplos.'.format(dataset_no_outliers.shape[0]))
Inferencia: Antes de remover outliers, el dataset tenía 5015 ejemplos.
Después de remover los outliers, el dataset ahora tiene 2349 ejemplos.
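The z-score rule and the IQR rule can disagree on the same data, so it matters which one ends up being applied. The following sketch runs both on a small invented series (the values are hypothetical and not taken from the movie dataset); the z-score here uses the population standard deviation (ddof=0), which is also the default of scipy.stats.zscore.

```python
import numpy as np
import pandas as pd

# Hypothetical values: nine ordinary observations and one extreme one.
values = pd.Series([10, 11, 12, 12, 13, 13, 14, 15, 16, 60], dtype=float)

# Rule 1: flag |z| >= 3, with the population standard deviation (ddof=0).
z = (values - values.mean()) / values.std(ddof=0)
z_outliers = values[np.abs(z) >= 3]

# Rule 2: Tukey fences at 1.5 * IQR beyond the quartiles.
q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1
iqr_outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]

print("z-score rule:", z_outliers.tolist())
print("IQR rule:    ", iqr_outliers.tolist())
```

In this example the extreme value inflates the standard deviation enough that its own z-score stays just under 3, so the z-score rule flags nothing, while the IQR rule flags 60. On skewed columns the IQR rule tends to remove more rows, which is worth keeping in mind when reading the row counts above.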
if is_interactive:
    EDAVisualizerInteractive.plot_boxplot(
        title="Comparación sin outliers de IMDb Score por País",
        data=dataset_no_outliers,
        x="country", y="imdb_score", x_label="País", y_label="IMDb Score"
    )
else:
    EDAVisualizerStatic.plot_boxplot(
        title="Comparación sin outliers de IMDb Score por País",
        data=dataset_no_outliers,
        x="country", y="imdb_score", x_label="País", y_label="IMDb Score"
    )
One hot encoding¶
dataset_no_outliers = dataset_no_outliers.copy()
dataset_no_outliers['genres'] = dataset_no_outliers['genres'].fillna('')
dataset_no_outliers['genres_list'] = dataset_no_outliers['genres'].apply(lambda x: x.split('|'))
mlb = MultiLabelBinarizer()
genres_encoded = pd.DataFrame(mlb.fit_transform(dataset_no_outliers['genres_list']), columns=mlb.classes_, index=dataset_no_outliers.index) # type: ignore
dataset_no_outliers = pd.concat([dataset_no_outliers, genres_encoded], axis=1)
dataset_no_outliers = dataset_no_outliers.drop('genres', axis=1)
dataset_no_outliers = dataset_no_outliers.drop('genres_list', axis=1)
categorical_features.remove("genres")
dataset_no_outliers.head()
| | color | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_1_facebook_likes | gross | num_voted_users | cast_total_facebook_likes | facenumber_in_poster | ... | Music | Musical | Mystery | News | Romance | Sci-Fi | Sport | Thriller | War | Western |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 177 | Color | 21.0 | 60.0 | 0.0 | 184.0 | 982.0 | 93655348.5 | 16769 | 1687 | 2.0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 215 | Color | 85.0 | 102.0 | 323.0 | 241.0 | 845.0 | 32694788.0 | 101411 | 1815 | 1.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 242 | Color | 33.0 | 116.0 | 0.0 | 141.0 | 936.0 | 114038688.0 | 20567 | 1609 | 1.0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 306 | Color | 174.0 | 121.0 | 0.0 | 595.0 | 1000.0 | 66862068.0 | 89509 | 3903 | 0.0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 324 | Color | 97.0 | 110.0 | 342.0 | 393.0 | 623.0 | 10200000.0 | 18697 | 1722 | 0.0 | ... | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
5 rows × 42 columns
dataset_no_outliers.info()
<class 'pandas.core.frame.DataFrame'>
Index: 2349 entries, 177 to 5042
Data columns (total 42 columns):
 #   Column                     Non-Null Count  Dtype
---  ------                     --------------  -----
 0   color                      2349 non-null   object
 1   num_critic_for_reviews     2349 non-null   float64
 2   duration                   2349 non-null   float64
 3   director_facebook_likes    2349 non-null   float64
 4   actor_3_facebook_likes     2349 non-null   float64
 5   actor_1_facebook_likes     2349 non-null   float64
 6   gross                      2349 non-null   float64
 7   num_voted_users            2349 non-null   int64
 8   cast_total_facebook_likes  2349 non-null   int64
 9   facenumber_in_poster       2349 non-null   float64
 10  num_user_for_reviews       2349 non-null   float64
 11  language                   2349 non-null   object
 12  country                    2349 non-null   object
 13  content_rating             2349 non-null   object
 14  budget                     2349 non-null   float64
 15  title_year                 2349 non-null   float64
 16  actor_2_facebook_likes     2349 non-null   float64
 17  imdb_score                 2349 non-null   float64
 18  aspect_ratio               2349 non-null   float64
 19  movie_facebook_likes       2349 non-null   int64
 20  Action                     2349 non-null   int64
 21  Adventure                  2349 non-null   int64
 22  Animation                  2349 non-null   int64
 23  Biography                  2349 non-null   int64
 24  Comedy                     2349 non-null   int64
 25  Crime                      2349 non-null   int64
 26  Documentary                2349 non-null   int64
 27  Drama                      2349 non-null   int64
 28  Family                     2349 non-null   int64
 29  Fantasy                    2349 non-null   int64
 30  History                    2349 non-null   int64
 31  Horror                     2349 non-null   int64
 32  Music                      2349 non-null   int64
 33  Musical                    2349 non-null   int64
 34  Mystery                    2349 non-null   int64
 35  News                       2349 non-null   int64
 36  Romance                    2349 non-null   int64
 37  Sci-Fi                     2349 non-null   int64
 38  Sport                      2349 non-null   int64
 39  Thriller                   2349 non-null   int64
 40  War                        2349 non-null   int64
 41  Western                    2349 non-null   int64
dtypes: float64(13), int64(25), object(4)
memory usage: 789.1+ KB
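A minimal sketch of the multi-label encoding used for genres, on three invented rows (the toy frame and its index labels are hypothetical). It shows why the encoded frame is built with the original index: the later pd.concat aligns on index labels, so a fresh 0..n-1 index would misplace the genre flags after rows have been filtered out.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical rows; the non-contiguous index mimics a filtered dataset.
toy = pd.DataFrame(
    {"genres": ["Action|Sci-Fi", "Drama", "Action|Drama"]},
    index=[10, 20, 30],
)

mlb = MultiLabelBinarizer()
encoded = pd.DataFrame(
    mlb.fit_transform(toy["genres"].str.split("|")),  # one list of labels per row
    columns=mlb.classes_,                              # sorted genre names
    index=toy.index,                                   # keep the original row labels
)

print(pd.concat([toy, encoded], axis=1))
```

Each film gets one 0/1 column per genre, and a film with several genres has several columns set to 1, which a plain pd.get_dummies on the raw string would not produce.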
Dummy variables¶
# language_dummies = pd.get_dummies(dataset_no_outliers['language'], prefix='lang')
# dataset_no_outliers = dataset_no_outliers.drop('language', axis=1)
# dataset_no_outliers = pd.concat([dataset_no_outliers, language_dummies], axis=1)
dummies = pd.get_dummies(dataset_no_outliers[categorical_features], drop_first=True)
dataset_no_outliers = pd.concat([dataset_no_outliers.drop(columns=categorical_features), dummies], axis=1)
dataset_no_outliers.head()
| | num_critic_for_reviews | duration | director_facebook_likes | actor_3_facebook_likes | actor_1_facebook_likes | gross | num_voted_users | cast_total_facebook_likes | num_user_for_reviews | budget | ... | content_rating_PG | content_rating_PG-13 | content_rating_R | content_rating_TV-14 | content_rating_TV-G | content_rating_TV-MA | content_rating_TV-PG | content_rating_Unknown | content_rating_Unrated | content_rating_X |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 177 | 21.0 | 60.0 | 0.0 | 184.0 | 982.0 | 93655348.5 | 16769 | 1687 | 74.0 | 1500000.0 | ... | False | False | False | True | False | False | False | False | False | False |
| 215 | 85.0 | 102.0 | 323.0 | 241.0 | 845.0 | 32694788.0 | 101411 | 1815 | 546.0 | 85000000.0 | ... | False | False | True | False | False | False | False | False | False | False |
| 242 | 33.0 | 116.0 | 0.0 | 141.0 | 936.0 | 114038688.0 | 20567 | 1609 | 36.0 | 78000000.0 | ... | False | False | False | False | False | False | False | True | False | False |
| 306 | 174.0 | 121.0 | 0.0 | 595.0 | 1000.0 | 66862068.0 | 89509 | 3903 | 524.0 | 83000000.0 | ... | False | False | True | False | False | False | False | False | False | False |
| 324 | 97.0 | 110.0 | 342.0 | 393.0 | 623.0 | 10200000.0 | 18697 | 1722 | 263.0 | 10000000.0 | ... | True | False | False | False | False | False | False | False | False | False |
5 rows × 118 columns
dataset_no_outliers.info()
<class 'pandas.core.frame.DataFrame'>
Index: 2349 entries, 177 to 5042
Columns: 118 entries, num_critic_for_reviews to content_rating_X
dtypes: bool(80), float64(13), int64(25)
memory usage: 899.2 KB
Inferring variable types after one hot encoding¶
target = 'imdb_score'
features = [col for col in dataset_no_outliers.columns if col != target]
number_unique_rows = dataset_no_outliers[features].nunique()
numerical_features = []
categorical_features = []
for col in features:
    # Treat text columns and columns with at most 45 distinct values as categorical
    if dataset_no_outliers[col].dtype == 'object' or number_unique_rows[col] <= 45:
        categorical_features.append(col)
    else:
        numerical_features.append(col)
print('\n\033[1mInferencia:\033[0m El dataset tiene {} features numéricas y {} features categóricas.'.format(len(numerical_features),len(categorical_features)))
Inferencia: El dataset tiene 12 features numéricas y 105 features categóricas.
Correlation matrix¶
correlation_matrix = dataset_no_outliers.select_dtypes(include=['int64', 'float64']).corr().round(2)
if is_interactive:
    EDAVisualizerInteractive.plot_heatmap(correlation_matrix, "Mapa de Calor Interactivo de Correlaciones", 1200, 1200, 7)
else:
    EDAVisualizerStatic.plot_heatmap(correlation_matrix, figsize=(50,25))
Reshape the correlation matrix into a series of variable pairs
correlation_pairs = correlation_matrix.unstack()
correlation_pairs
num_critic_for_reviews num_critic_for_reviews 1.00
duration 0.16
director_facebook_likes 0.21
actor_3_facebook_likes 0.13
actor_1_facebook_likes 0.20
...
aspect_ratio Thriller 0.14
War 0.08
Western 0.07
facenumber_in_poster 0.02
aspect_ratio 1.00
Length: 1444, dtype: float64
Keep the variable pairs whose absolute correlation is greater than 0.6
threshold = 0.6
sorted_pairs = correlation_pairs.sort_values(ascending=False) # type: ignore
high_correlated_pairs = sorted_pairs[((sorted_pairs > threshold) | (sorted_pairs < -threshold)) & (sorted_pairs != 1)]
high_correlated_pairs
actor_1_facebook_likes     cast_total_facebook_likes    0.98
cast_total_facebook_likes  actor_1_facebook_likes       0.98
actor_2_facebook_likes     actor_3_facebook_likes       0.85
actor_3_facebook_likes     actor_2_facebook_likes       0.85
num_voted_users            num_user_for_reviews         0.74
num_user_for_reviews       num_voted_users              0.74
num_critic_for_reviews     num_user_for_reviews         0.64
num_user_for_reviews       num_critic_for_reviews       0.64
dtype: float64
vars_corr = list(set([i[0] for i in high_correlated_pairs.index] + [i[1] for i in high_correlated_pairs.index]))
filtered_corr = correlation_matrix.loc[vars_corr, vars_corr]
if is_interactive:
    EDAVisualizerInteractive.plot_heatmap(filtered_corr, "Mapa de Calor de Pares Altamente Correlacionados", 800, 1200, 12)
else:
    EDAVisualizerStatic.plot_heatmap(filtered_corr, figsize=(20,10))
The results above show several highly correlated pairs, which can be read as follows:
- cast_total_facebook_likes and actor_1_facebook_likes → 0.98: the lead actor usually accounts for most of the cast's total likes. Consider dropping one of the two in linear models to avoid collinearity.
- actor_2_facebook_likes and actor_3_facebook_likes → 0.85: supporting actors tend to have similar popularity, possibly because they share the type of role or level of exposure.
- num_voted_users and num_user_for_reviews → 0.74: more votes go with more reviews, which may reflect popularity.
- num_critic_for_reviews and num_user_for_reviews → 0.64: films with more user reviews also tend to receive more attention from critics, which reflects media visibility.
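Every pair in the unstacked output appears twice, once as (a, b) and once as (b, a). One possible way to list each pair only once is to keep just the strict upper triangle of the correlation matrix before stacking. The sketch below is an alternative under that assumption, run on synthetic columns (a, b, c are invented names), not the notebook's own method.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.1, size=200),  # nearly a copy of "a"
    "c": rng.normal(size=200),                 # independent noise
})

corr = toy.corr()
# Strict upper triangle: drops the diagonal and the mirrored lower half.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack()
# NaN entries (if any survive stacking) compare as False and are filtered out here.
high = pairs[pairs.abs() > 0.6].sort_values(ascending=False)

print(high)
```

Filtering on the absolute value also picks up strong negative correlations, and masking the diagonal avoids relying on a comparison against exactly 1.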
Dimensionality reduction¶
For dimensionality reduction I will use PCA (Principal Component Analysis). PCA simplifies a dataset that has many numeric variables, and it removes redundancy when several columns are correlated by combining them into a smaller set of more informative components.
x_num = dataset_no_outliers.drop(columns=['imdb_score']).select_dtypes(include=['int64', 'float64']).dropna()
x_scaled = StandardScaler().fit_transform(x_num)
2D projection¶
pca = PCA(n_components=2)
x_pca = pca.fit_transform(x_scaled)
plt.scatter(x_pca[:, 0], x_pca[:, 1], c=dataset_no_outliers[target], cmap='viridis')
plt.xlabel('Componente 1')
plt.ylabel('Componente 2')
plt.title('Proyección PCA coloreada por IMDb Score')
plt.colorbar(label='IMDb Score')
plt.show()
loadings = pd.DataFrame(pca.components_.T,
columns=['PC1', 'PC2'],
index=x_num.columns)
loadings.head(10)
| | PC1 | PC2 |
|---|---|---|
| num_critic_for_reviews | 0.279405 | -0.164931 |
| duration | 0.141880 | -0.235369 |
| director_facebook_likes | 0.128136 | -0.083867 |
| actor_3_facebook_likes | 0.286269 | 0.114801 |
| actor_1_facebook_likes | 0.288075 | -0.032980 |
| gross | 0.278304 | 0.163043 |
| num_voted_users | 0.364319 | -0.044537 |
| cast_total_facebook_likes | 0.320105 | -0.004969 |
| num_user_for_reviews | 0.341804 | -0.124066 |
| budget | 0.313881 | 0.131411 |
loadings['PC1'].sort_values(ascending=False).head(10)
num_voted_users              0.364319
num_user_for_reviews         0.341804
cast_total_facebook_likes    0.320105
budget                       0.313881
actor_2_facebook_likes       0.304629
actor_1_facebook_likes       0.288075
actor_3_facebook_likes       0.286269
num_critic_for_reviews       0.279405
gross                        0.278304
Action                       0.145027
Name: PC1, dtype: float64
Principal component 1 (PC1) captures the largest share of the variance, that is, it is the direction of maximum variation in the data. The variables with the largest loadings on it relate to popularity (num_voted_users, num_user_for_reviews), media visibility (num_critic_for_reviews), economic impact (budget, gross) and social media presence (the Facebook likes variables). duration loads only weakly (0.14) and falls outside the ten largest loadings.
Based on those groups, PC1 can be read as a dimension of "public exposure and reception": how visible, voted on, commented on and commercially successful a film is. Films with high PC1 values tend to receive more votes and more reviews, to earn more, and to be more visible on social media.
loadings['PC2'].sort_values(ascending=False).head(10)
Family                    0.402911
Animation                 0.339887
Comedy                    0.321957
Adventure                 0.289094
Fantasy                   0.264886
gross                     0.163043
Musical                   0.150288
budget                    0.131411
actor_3_facebook_likes    0.114801
actor_2_facebook_likes    0.095359
Name: PC2, dtype: float64
Principal component 2 (PC2) is the direction of second-largest variation in the data, orthogonal to PC1. Here it captures the thematic and stylistic side of the films: unlike PC1, it is driven by genre and target audience rather than by exposure. PC2 can therefore be read as a dimension of "narrative and audience". Films with high PC2 values tend to target broad or family audiences through animation, fantasy, adventure and comedy, and the positive loadings of gross and budget suggest a possible association with higher-budget productions that take a lighter narrative approach.
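How much variance each component captures can be read from the explained_variance_ratio_ attribute of a fitted scikit-learn PCA. The sketch below uses synthetic data (one invented latent factor driving three columns, plus an unrelated column), so the numbers it prints say nothing about the movie dataset; it only shows where the figures come from.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
latent = rng.normal(size=(300, 1))
# Three noisy copies of one latent factor, plus one unrelated column (synthetic).
X = np.hstack([
    latent + rng.normal(scale=0.3, size=(300, 3)),
    rng.normal(size=(300, 1)),
])

X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2).fit(X_scaled)

ratios = pca.explained_variance_ratio_   # share of total variance per component
print("Per component:", ratios.round(3))
print("Cumulative:   ", ratios.cumsum().round(3))
```

Running the same two print statements on the notebook's fitted pca object would show how much of the original variance the two or three retained components keep, which helps when judging a regression fitted on those components.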
3D projection¶
pca = PCA(n_components=3)
x_pca = pca.fit_transform(x_scaled)
# Reuse the index of x_num so the score column aligns row by row; with the default
# RangeIndex the assignment below would align on labels and leave NaN values
dataset_pca = pd.DataFrame(x_pca, columns=['PC1', 'PC2', 'PC3'], index=x_num.index)
dataset_pca['IMDb Score'] = dataset_no_outliers['imdb_score']
if is_interactive:
    EDAVisualizerInteractive.plot_3d_projection(
        dataset=dataset_pca,
        title='Proyección PCA en 3D (Interactiva)',
        x_label="PC1",
        y_label="PC2",
        z_label="PC3",
        label="IMDb Score"
    )
else:
    EDAVisualizerStatic.plot_3d_projection(
        dataset=dataset_no_outliers,
        column='imdb_score',
        title='Proyección PCA en 3D coloreada por IMDb Score',
        x_label="Componente 1",
        y_label="Componente 2",
        z_label="Componente 3",
        cbar_label="IMDb Score",
        figsize=(10, 8)
    )
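One detail worth checking when a NumPy array of PCA scores is wrapped in a DataFrame: pandas aligns a Series assigned as a column on index labels, not on position. The sketch below uses three invented rows (the index labels and scores are hypothetical) to show what happens when the new frame keeps the default 0..n-1 index while the score column keeps the labels of a filtered dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical scores indexed like a filtered dataset (labels are not 0, 1, 2).
scores = pd.Series([7.1, 6.4, 8.0], index=[177, 215, 242], name="imdb_score")
components = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

# Default RangeIndex: no label in common with `scores`, so every value becomes NaN.
misaligned = pd.DataFrame(components, columns=["PC1", "PC2"])
misaligned["IMDb Score"] = scores

# Passing the original index keeps each score on its own row.
aligned = pd.DataFrame(components, columns=["PC1", "PC2"], index=scores.index)
aligned["IMDb Score"] = scores

print(misaligned, aligned, sep="\n\n")
```

An equivalent option is to assign scores.to_numpy(), which drops the labels and matches purely by position.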
loadings = pd.DataFrame(pca.components_.T,
columns=['PC1', 'PC2', 'PC3'],
index=x_num.columns)
loadings.head(10)
| | PC1 | PC2 | PC3 |
|---|---|---|---|
| num_critic_for_reviews | 0.279405 | -0.164931 | 0.116179 |
| duration | 0.141880 | -0.235369 | -0.147009 |
| director_facebook_likes | 0.128136 | -0.083867 | 0.007356 |
| actor_3_facebook_likes | 0.286269 | 0.114801 | -0.201801 |
| actor_1_facebook_likes | 0.288075 | -0.032980 | -0.236891 |
| gross | 0.278304 | 0.163043 | 0.072336 |
| num_voted_users | 0.364319 | -0.044537 | 0.083702 |
| cast_total_facebook_likes | 0.320105 | -0.004969 | -0.256920 |
| num_user_for_reviews | 0.341804 | -0.124066 | 0.134412 |
| budget | 0.313881 | 0.131411 | 0.069345 |
loadings['PC1'].sort_values(ascending=False).head(10)
num_voted_users              0.364319
num_user_for_reviews         0.341804
cast_total_facebook_likes    0.320105
budget                       0.313881
actor_2_facebook_likes       0.304629
actor_1_facebook_likes       0.288075
actor_3_facebook_likes       0.286269
num_critic_for_reviews       0.279405
gross                        0.278304
Action                       0.145027
Name: PC1, dtype: float64
loadings['PC2'].sort_values(ascending=False).head(10)
Family                    0.402911
Animation                 0.339887
Comedy                    0.321957
Adventure                 0.289094
Fantasy                   0.264886
gross                     0.163043
Musical                   0.150288
budget                    0.131411
actor_3_facebook_likes    0.114801
actor_2_facebook_likes    0.095359
Name: PC2, dtype: float64
loadings['PC3'].sort_values(ascending=False).head(10)
Horror                    0.321810
Thriller                  0.308382
Sci-Fi                    0.240482
Mystery                   0.231936
Action                    0.212305
Adventure                 0.153874
Fantasy                   0.152227
num_user_for_reviews      0.134412
num_critic_for_reviews    0.116179
Animation                 0.103772
Name: PC3, dtype: float64
Component 3 (PC3) captures subtler or more specific aspects, such as combinations of genres or narrative themes. Here it is associated with a dimension of narrative intensity and dark themes, and it contrasts with PC2, which makes it possible to compare productions with high emotional impact against lighter, family-oriented ones.
PC2 and PC3 together support a thematic segmentation, since both separate films by narrative tone. Comparing PC1 against the IMDb score shows whether popularity is associated with perceived quality.
This PCA condenses the information of many variables into three principal components that capture latent dimensions of the movie dataset: the first reflects public impact and mass reception (votes, reviews, commercial success), the second groups family and fantasy themes (genres such as family, animation, adventure and comedy), and the third represents intense, dark narratives (horror, thriller, sci-fi, mystery).
Multiple linear regression¶
Linear regression with all variables and without PCA¶
X = dataset_no_outliers.drop(columns=['imdb_score'])
y = dataset_no_outliers['imdb_score']
linearRegressionWithoutPCA = CustomLinearRegression(X, y, "Linear_regression_without_PCA", is_interactive)
linearRegressionWithoutPCA.run()
--> Iniciando la division del dataset
Tamaño del dataset: 2349
Tamaño del dataset de entrenamiento: 1879
Tamaño del dataset de prueba: 470
----------------------------------------
--> Iniciando el entrenamiento del modelo
----------------------------------------
--> Iniciando la predicción del modelo
----------------------------------------
--> Iniciando la evaluación del modelo
Error absoluto medio (MAE): 0.55
Error cuadrático medio (MSE): 0.49
Coeficiente de determinación (R²): 0.37
----------------------------------------
--> Iniciando la creación del dataframe de coeficientes
----------------------------------------
--> Prediciendo sobre entrenamiento y prueba
----------------------------------------
--> Graficando comparación del modelo
----------------------------------------
--> Graficando residuos
----------------------------------------
--> Graficando importance de variables
----------------------------------------
linearRegressionWithoutPCA.summary()
RESUMEN DEL MODELO
----------------------------------------
--> Métricas de desempeño:
| | Total Features | MAE | MSE | R2 |
|---|---|---|---|---|
| 0 | 117 | 0.554463 | 0.489728 | 0.370996 |
--> Principales coeficientes:
| | Variable | Coeficiente |
|---|---|---|
| 110 | content_rating_TV-14 | 1.993295 |
| 112 | content_rating_TV-MA | 1.904206 |
| 84 | country_Iran | 1.774873 |
| 111 | content_rating_TV-G | 1.253610 |
| 50 | language_Hebrew | 1.165313 |
| 19 | Documentary | 1.098341 |
| 54 | language_Japanese | 0.921692 |
| 60 | language_Norwegian | 0.910126 |
| 113 | content_rating_TV-PG | 0.908675 |
| 44 | language_Dutch | 0.767355 |
Linear regression with PCA¶
X = x_pca
y = dataset_no_outliers['imdb_score']
linearRegressionWithPCA = CustomLinearRegression(X, y, "Linear_regression_with_PCA", is_interactive)
linearRegressionWithPCA.run()
--> Iniciando la division del dataset
Tamaño del dataset: 2349
Tamaño del dataset de entrenamiento: 1879
Tamaño del dataset de prueba: 470
----------------------------------------
--> Iniciando el entrenamiento del modelo
----------------------------------------
--> Iniciando la predicción del modelo
----------------------------------------
--> Iniciando la evaluación del modelo
Error absoluto medio (MAE): 0.55
Error cuadrático medio (MSE): 0.49
Coeficiente de determinación (R²): 0.37
----------------------------------------
--> Iniciando la creación del dataframe de coeficientes
----------------------------------------
--> Prediciendo sobre entrenamiento y prueba
----------------------------------------
--> Graficando comparación del modelo
----------------------------------------
--> Graficando residuos
----------------------------------------
--> Graficando importance de variables
----------------------------------------
linearRegressionWithPCA.summary()
RESUMEN DEL MODELO
----------------------------------------
--> Métricas de desempeño:
| | Total Features | MAE | MSE | R2 |
|---|---|---|---|---|
| 0 | 3 | 0.694786 | 0.741951 | 0.047041 |
--> Principales coeficientes:
| | Variable | Coeficiente |
|---|---|---|
| 0 | PC1 | -0.023093 |
| 2 | PC3 | -0.105278 |
| 1 | PC2 | -0.163191 |
Tests with other models¶
Random Forest regression¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
# Train the model
rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X, y)
# Predict on the same data used for training (in-sample evaluation, so the metrics are optimistic)
y_pred_rf = rf.predict(X)
# Evaluation metrics
mse = mean_squared_error(y, y_pred_rf)
r2 = r2_score(y, y_pred_rf)
print(f"Error cuadrático medio (MSE): {mse:.2f}")
print(f"Coeficiente de determinación (R²): {r2:.2f}")
Error cuadrático medio (MSE): 0.07
Coeficiente de determinación (R²): 0.92
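Metrics computed on the same rows used for fitting tend to flatter flexible models such as random forests. The sketch below illustrates the gap on synthetic data (an invented weak signal with heavy noise, so the numbers are not comparable to the notebook's) by scoring the same model on its training rows and on a held-out split.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = X[:, 0] + rng.normal(scale=1.0, size=400)   # weak signal, heavy noise (synthetic)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
rf = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

r2_train = r2_score(y_train, rf.predict(X_train))   # in-sample
r2_test = r2_score(y_test, rf.predict(X_test))      # held-out
print(f"R² train: {r2_train:.2f} | R² test: {r2_test:.2f}")
```

The held-out score is the one that estimates how the model would behave on unseen films; the cross-validated comparison later in the notebook serves the same purpose.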
Gradient Boosting regression¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
# Train the model
gb = GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, random_state=42)
gb.fit(X, y)
# Predict on the same data used for training (in-sample evaluation)
y_pred_gb = gb.predict(X)
# Evaluation metrics
mse_gb = mean_squared_error(y, y_pred_gb)
r2_gb = r2_score(y, y_pred_gb)
print(f"Error cuadrático medio (MSE): {mse_gb:.2f}")
print(f"Coeficiente de determinación (R²): {r2_gb:.2f}")
Error cuadrático medio (MSE): 0.31
Coeficiente de determinación (R²): 0.65
Support Vector Machine¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
# Train the SVR model
svr = SVR(kernel='rbf', C=1.0, epsilon=0.2)
svr.fit(X, y)
# Predict on the same data used for training (in-sample evaluation)
y_pred_svr = svr.predict(X)
# Evaluation metrics
mse_svr = mean_squared_error(y, y_pred_svr)
r2_svr = r2_score(y, y_pred_svr)
print(f"Error cuadrático medio (MSE): {mse_svr:.2f}")
print(f"Coeficiente de determinación (R²): {r2_svr:.2f}")
Error cuadrático medio (MSE): 0.84
Coeficiente de determinación (R²): 0.04
K Nearest Neighbors¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
# Train the KNN model
knn = KNeighborsRegressor(n_neighbors=5)
knn.fit(X, y)
# Predict on the same data used for training (in-sample evaluation)
y_pred_knn = knn.predict(X)
# Evaluation metrics
mse_knn = mean_squared_error(y, y_pred_knn)
r2_knn = r2_score(y, y_pred_knn)
print(f"Error cuadrático medio (MSE): {mse_knn:.2f}")
print(f"Coeficiente de determinación (R²): {r2_knn:.2f}")
Error cuadrático medio (MSE): 0.66
Coeficiente de determinación (R²): 0.25
Ridge¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
# Train the Ridge model
ridge = Ridge(alpha=1.0)
ridge.fit(X, y)
# Predict on the same data used for training (in-sample evaluation)
y_pred_ridge = ridge.predict(X)
# Evaluation metrics
mse_ridge = mean_squared_error(y, y_pred_ridge)
r2_ridge = r2_score(y, y_pred_ridge)
print(f"Error cuadrático medio (MSE Ridge): {mse_ridge:.2f}")
print(f"Coeficiente de determinación (R² Ridge): {r2_ridge:.2f}")
Error cuadrático medio (MSE Ridge): 0.46
Coeficiente de determinación (R² Ridge): 0.47
c:\Users\cpaez\AppData\Local\Programs\Python\Python313\Lib\site-packages\scipy\_lib\_util.py:1233: LinAlgWarning: Ill-conditioned matrix (rcond=3.20624e-19): result may not be accurate.
Test with high correlation¶
X = dataset_no_outliers[features]
y = dataset_no_outliers[target]
models = {
'Linear': LinearRegression(),
'Random Forest': RandomForestRegressor(n_estimators=100, random_state=42),
'Gradient Boosting': GradientBoostingRegressor(n_estimators=100, learning_rate=0.1, random_state=42),
'SVR': SVR(kernel='rbf', C=1.0, epsilon=0.2),
'K Neighbors': KNeighborsRegressor(n_neighbors=5),
'Ridge': Ridge(alpha=1.0)
}
results = {
'Modelo': [],
'R²': [],
'MSE': []
}
for name, model in models.items():
    # Mean R² across the 5 folds
    r2_mean = cross_val_score(model, X, y, cv=5, scoring='r2').mean()
    # 'neg_mean_squared_error' returns negated values, so the sign is flipped exactly once
    mse_mean = -cross_val_score(model, X, y, cv=5, scoring='neg_mean_squared_error').mean()
    print(f"{name} → R²: {r2_mean:.3f}, MSE: {mse_mean:.3f}")
    results['Modelo'].append(name)
    results['R²'].append(round(r2_mean, 3))
    results['MSE'].append(round(mse_mean, 2))
fig = go.Figure()
# R²
fig.add_trace(go.Bar(
x=results['Modelo'],
y=results['R²'],
name='R²',
marker_color='mediumseagreen',
text=results['R²'],
textposition='auto'
))
# MSE
fig.add_trace(go.Bar(
x=results['Modelo'],
y=results['MSE'],
name='MSE',
marker_color='indianred',
text=results['MSE'],
textposition='auto'
))
# Chart layout
fig.update_layout(
title='Comparación de Modelos de Regresión (Validación Cruzada)',
barmode='group',
xaxis_title='Modelo',
yaxis_title='Valor',
template='plotly_white',
width=1000,
height=600
)
fig.show()
fig_title = "assets/Validacion_cruzada_modelos.html"
fig.write_html(fig_title)
print("Gráfica guardada en", fig_title)
Linear → R²: 0.352, MSE: 0.555
Random Forest → R²: 0.398, MSE: 0.513
Gradient Boosting → R²: 0.438, MSE: 0.482
SVR → R²: -0.024, MSE: 0.868
K Neighbors → R²: -0.167, MSE: 0.990
Ridge → R²: 0.373, MSE: 0.536
c:\Users\cpaez\AppData\Local\Programs\Python\Python313\Lib\site-packages\scipy\_lib\_util.py:1233: LinAlgWarning: Ill-conditioned matrix (rcond=6.0454e-19): result may not be accurate.
Gráfica guardada en assets/Validacion_cruzada_modelos.html
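Each model above is cross-validated twice, once per metric. scikit-learn's cross_validate accepts several scorers in one call, so both metrics can come from a single pass over the folds. The sketch below shows this on data generated with make_regression (synthetic, so the values are unrelated to the notebook's); the dictionary keys "r2" and "mse" are names chosen here for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate

# Synthetic regression problem, used only to make the sketch runnable.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

cv = cross_validate(
    Ridge(alpha=1.0), X, y, cv=5,
    scoring={"r2": "r2", "mse": "neg_mean_squared_error"},
)

mean_r2 = cv["test_r2"].mean()
mean_mse = -cv["test_mse"].mean()   # the scorer is negated, so flip the sign once
print(f"R²: {mean_r2:.3f}, MSE: {mean_mse:.3f}")
```

Besides halving the fitting time, this guarantees that both metrics are computed on exactly the same fitted models.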
How to interpret the results
- R² (coefficient of determination): measures the proportion of the target's variance that the model explains.
  - Values close to 1 indicate a good fit.
  - Negative values (as with SVR and KNN) mean the model performs worse than a horizontal line at the mean of the data.
- MSE (mean squared error): the average of the squared errors.
  - The lower, the better.
  - It is sensitive to extreme values.
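Both metrics are simple enough to compute by hand, which makes the meaning of a negative R² concrete. The sketch below uses three invented scores and two hypothetical predictors: one that always predicts the mean and one that gets the trend backwards.

```python
import numpy as np

y_true = np.array([6.0, 7.0, 8.0])
y_mean_pred = np.full(3, y_true.mean())   # always predicts the mean
y_bad_pred = np.array([8.0, 7.0, 6.0])    # gets the trend backwards

def mse(y, y_hat):
    """Mean of the squared errors."""
    return float(np.mean((y - y_hat) ** 2))

def r2(y, y_hat):
    """1 minus (residual sum of squares / total sum of squares)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

print("mean predictor:", mse(y_true, y_mean_pred), r2(y_true, y_mean_pred))
print("bad predictor: ", mse(y_true, y_bad_pred), r2(y_true, y_bad_pred))
```

The mean predictor scores R² = 0 by construction, and the reversed predictor scores R² = -3 because its squared error is four times the variance around the mean.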
Technical diagnosis
- Gradient Boosting is the best model on this dataset under cross-validation, with the highest R² (0.438) and the lowest MSE (0.482).
- SVR and KNN perform poorly, probably because of:
  - Unscaled features (both models are sensitive to scale).
  - High dimensionality or noise in the data.
- Ridge improves slightly on Linear (R² 0.373 vs 0.352) by controlling collinearity.
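The scaling hypothesis can be tested by wrapping the distance-based models in a pipeline that standardizes the features inside each cross-validation fold. The sketch below does this for KNN on synthetic data (one informative unit-scale column and one uninformative column on a budget-like scale, both invented); the same make_pipeline pattern applies to SVR. Whether it helps on the movie dataset is not shown here and would have to be measured.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300
signal = rng.normal(size=n)              # informative, unit scale (synthetic)
big = rng.normal(scale=1e6, size=n)      # uninformative, very large scale (synthetic)
X = np.column_stack([signal, big])
y = 2 * signal + rng.normal(scale=0.1, size=n)

raw = KNeighborsRegressor(n_neighbors=5)
scaled = make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5))

r2_raw = cross_val_score(raw, X, y, cv=5, scoring="r2").mean()
r2_scaled = cross_val_score(scaled, X, y, cv=5, scoring="r2").mean()
print(f"KNN raw R²: {r2_raw:.3f} | KNN scaled R²: {r2_scaled:.3f}")
```

Without scaling, the large column dominates the distance computation and the neighbors are effectively random; with the scaler inside the pipeline, the scaling parameters are learned on the training folds only, which avoids leaking information from the validation fold.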